A Little Green Grid
A small walkthrough of the GitHub contribution component on my homepage: why it scrapes GitHub's public HTML, how it fails safely, and the little test that keeps it from quietly breaking.
I added a GitHub activity graph to the homepage because I wanted the site to feel a little more alive.
Not in the fake "currently listening to" way, and not with a dashboard full of vanity stats. Just a small, familiar signal: here is the shape of the work. Some days are quiet. Some days are darker. The grid tells that story without needing much copy around it.
The funny part is that GitHub does not give you a clean public API for this exact little chart. At least not one I wanted to wire OAuth around for a portfolio homepage. So the component takes the boring route: it reads the same public contribution HTML GitHub already renders for a user profile, extracts the useful bits, and turns them into my own UI.
This post is the walkthrough I wish I had written before I forgot the details.
The data source is just GitHub's public contributions page
The whole thing starts here:
const GITHUB_CONTRIBUTIONS_URL = 'https://github.com/users/ftchvs/contributions';
That endpoint returns a small HTML fragment with the contribution calendar. If you have ever inspected a GitHub profile, you have seen the shape already. Each day is a table cell with attributes like:
<td
class="ContributionCalendar-day"
data-date="2026-05-07"
data-level="4"
></td>
For my use case, I only need two fields:
- data-date, so I know where the square belongs.
- data-level, so I know how intense the color should be.
I also pull the total contribution count from GitHub's heading text. It is not fancy. It is deliberately not fancy.
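A sketch of that extraction, assuming the heading reads something like "1,234 contributions in the last year" (the wording is my assumption about GitHub's markup, and the function name is mine):

```typescript
// Hypothetical helper: pull the yearly total out of the fragment's heading.
// The "N contributions" wording is an assumption, not a guarantee.
function parseTotalContributions(html: string): number | null {
  const match = html.match(/([\d][\d,]*)\s+contributions?/i);
  if (!match) return null;
  const total = Number(match[1].replace(/,/g, ''));
  return Number.isFinite(total) ? total : null;
}
```

If the heading is missing or reworded, this returns null, which is exactly what the totalContributions field expects.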
export type GitHubContributionDay = {
date: string;
level: 0 | 1 | 2 | 3 | 4;
};
export type GitHubContributionCalendar = {
totalContributions: number | null;
days: GitHubContributionDay[];
};
The parser scans for table cells with ContributionCalendar-day, grabs the two attributes, rejects anything weird, and returns a plain object the homepage can render.
The important part is the rejection. If GitHub changes something and starts returning a level of 9, I do not want that leaking into the UI as a surprise CSS state. It gets ignored.
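In sketch form, that looks roughly like this (my reconstruction, not the exact source; attribute order varies between cells, so each attribute is matched independently within the cell's opening tag):

```typescript
type Level = 0 | 1 | 2 | 3 | 4;
type Day = { date: string; level: Level };

// Reconstruction of the parsing approach: scan opening <td> tags that carry
// the calendar-day class, read the two data attributes, drop anything odd.
function parseContributionDays(html: string): Day[] {
  const days: Day[] = [];
  for (const [cell] of html.matchAll(/<td\b[^>]*ContributionCalendar-day[^>]*>/g)) {
    const date = cell.match(/data-date="(\d{4}-\d{2}-\d{2})"/)?.[1];
    const rawLevel = cell.match(/data-level="(\d+)"/)?.[1];
    if (!date || rawLevel === undefined) continue;
    const level = Number(rawLevel);
    // Reject unknown levels (say, a future data-level="9") instead of
    // letting them leak into the UI as a surprise CSS state.
    if (!Number.isInteger(level) || level < 0 || level > 4) continue;
    days.push({ date, level: level as Level });
  }
  return days;
}
```

Regex over HTML is usually a smell, but the input here is a small, flat fragment with a known shape, and anything the pattern cannot account for is supposed to be dropped anyway.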
Why scrape HTML instead of using the GitHub API?
Because this is a public decorative component, not core product infrastructure.
The GitHub GraphQL API can return contribution data, but then the homepage needs credentials, token handling, rate-limit behavior, and another secret in the deployment path. That is too much ceremony for a grid whose worst acceptable failure mode is "show nothing."
The public HTML route has tradeoffs:
Good:
- No OAuth.
- No private token.
- Easy to cache.
- Easy to inspect in a browser.
- Good enough for a portfolio homepage.
Bad:
- GitHub can change the markup.
- The parser is coupled to class names and attributes.
- The page now depends on a third-party request during render.
That last point is the one that matters most. If GitHub is slow, my homepage should not be slow with it.
The fetch has a deadline
The original version did a normal fetch(). That worked, until I thought about the first screen of the site waiting on GitHub for no good reason.
So now the request has a short timeout:
const GITHUB_CONTRIBUTIONS_TIMEOUT_MS = 2500;
Under the hood it uses an AbortController:
async function fetchWithTimeout(
input: RequestInfo | URL,
init: RequestInit,
): Promise<Response> {
const controller = new AbortController();
const timeoutId = setTimeout(() => {
controller.abort();
}, GITHUB_CONTRIBUTIONS_TIMEOUT_MS);
try {
return await fetch(input, {
...init,
signal: controller.signal,
});
} finally {
clearTimeout(timeoutId);
}
}
Two and a half seconds is already generous. If GitHub has not responded by then, the contribution grid is not important enough to keep waiting.
There is also a Next.js cache hint:
next: { revalidate: 300 }
So the page can reuse the result for five minutes. I do not need minute-by-minute contribution accuracy. Nobody does. If I push a commit and the square updates a few minutes later, that is fine.
Failure is a valid state
The component has one rule: GitHub activity should never break the homepage.
So every bad path returns the same boring fallback:
function emptyContributionCalendar(): GitHubContributionCalendar {
return {
totalContributions: null,
days: [],
};
}
HTTP error? Empty calendar.
Timeout? Empty calendar.
Markup changed? Probably empty calendar, or a partial one.
The homepage then renders the rest of the hero normally and swaps the summary copy to an unavailable state. The activity grid is a nice detail. It is not load-bearing.
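Wired together, the loader looks roughly like this (the names are mine, and the fetch and parser are injected as parameters here purely to keep the sketch self-contained):

```typescript
type Calendar = {
  totalContributions: number | null;
  days: { date: string; level: number }[];
};

const emptyCalendar = (): Calendar => ({ totalContributions: null, days: [] });

// Sketch of the loader: every bad path collapses into the same fallback.
async function loadCalendar(
  fetchImpl: typeof fetch,
  parse: (html: string) => Calendar,
): Promise<Calendar> {
  try {
    const response = await fetchImpl('https://github.com/users/ftchvs/contributions');
    if (!response.ok) return emptyCalendar(); // HTTP error -> empty calendar
    return parse(await response.text());      // markup drift is the parser's problem
  } catch {
    return emptyCalendar();                    // timeout, abort, network error
  }
}
```

The caller never sees a rejection, only a calendar that happens to be empty.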
This is the part I care about most in tiny components like this. Small visual flourishes are allowed to be fragile internally, but they should be boring externally. Users should not see the fragility.
Turning days into a grid
Once the parser returns a list of { date, level }, the homepage code groups those days into the GitHub-style calendar layout.
Dates become UTC dates so the grid does not drift by local timezone. Then each day gets placed by:
- week index: how many weeks since the first visible day
- row index: day of week
- level: color intensity
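A sketch of that placement, assuming the grid's origin is the Sunday on or before the first visible day (the helper name is mine):

```typescript
type PlacedDay = { date: string; week: number; row: number; level: number };

// Hypothetical layout helper: map each day onto (week column, weekday row).
function placeDays(days: { date: string; level: number }[]): PlacedDay[] {
  if (days.length === 0) return [];
  // Parse as UTC midnight so the layout never drifts with the viewer's timezone.
  const toUtc = (date: string) => new Date(`${date}T00:00:00Z`);
  // Anchor the grid at the Sunday on or before the first visible day.
  const origin = toUtc(days[0].date);
  origin.setUTCDate(origin.getUTCDate() - origin.getUTCDay());
  const dayMs = 24 * 60 * 60 * 1000;
  return days.map(({ date, level }) => {
    const offset = Math.round((toUtc(date).getTime() - origin.getTime()) / dayMs);
    return { date, week: Math.floor(offset / 7), row: offset % 7, level };
  });
}
```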
The UI also builds month labels and weekday labels with Intl.DateTimeFormat, using the current locale. That means the same component can sit on localized pages without hardcoding English labels.
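For example (pinning the locale to en-US here just to make the output deterministic; the real component passes the page's locale instead):

```typescript
// Locale-aware labels; the component uses the current page locale
// rather than a hardcoded one.
const monthLabel = new Intl.DateTimeFormat('en-US', { month: 'short', timeZone: 'UTC' });
const weekdayLabel = new Intl.DateTimeFormat('en-US', { weekday: 'short', timeZone: 'UTC' });

const day = new Date('2026-05-07T00:00:00Z');
monthLabel.format(day);   // "May"
weekdayLabel.format(day); // "Thu"
```

Swap the locale for "pt-BR" or "fr" and the same code yields localized labels with no English fallback strings in the component.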
The rendering itself is intentionally separate from the fetching. The page asks for data, shapes it, and passes simple props to the visual component. That keeps the network/parsing mess out of the UI layer.
I like this pattern: ugly edge of the world on one side, boring typed data on the other.
The regression test is a fake GitHub page
After adding the timeout, I added a tiny script:
npm run check:github-contributions
It does three things.
First, it feeds the parser a small fake GitHub calendar:
<td
tabindex="0"
data-date="2026-05-06"
data-level="0"
class="ContributionCalendar-day"
></td>
<td data-level="4" class="ContributionCalendar-day" data-date="2026-05-07"></td>
<td data-level="9" class="ContributionCalendar-day" data-date="2026-05-08"></td>
The data-level="9" is there on purpose. It proves the parser ignores invalid contribution levels instead of trusting the page blindly.
Second, the script replaces globalThis.fetch with a fake 404 response and checks that the component falls back to an empty calendar.
Third, it replaces fetch with a request that never resolves until the abort signal fires. That checks the timeout path without waiting on a real network call.
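The never-resolving fetch might look like this (my reconstruction of the idea, not the script's exact code):

```typescript
// Stand-in fetch that only settles when its abort signal fires, which
// exercises the timeout/abort path without a real network call.
const hangingFetch: typeof fetch = (_input, init) =>
  new Promise<Response>((_resolve, reject) => {
    init?.signal?.addEventListener('abort', () => {
      reject(new DOMException('The operation was aborted.', 'AbortError'));
    });
  });
```

Because fetchWithTimeout always passes a signal, the promise stays pending until the AbortController fires, and the test finishes in roughly the timeout duration instead of hanging forever.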
This is not a big test suite. It is just enough to catch the failures I actually expect:
- GitHub returns something non-200.
- GitHub is slow.
- GitHub's markup shifts in a way that breaks the parser.
The script then runs in the Validate workflow before the build.
What I would change if this became important
If this were a product feature, I would not scrape public HTML. I would use the GitHub API, store the result, and make the homepage read from my own cache. Maybe a scheduled job would refresh it. Maybe I would track failures and alert on parser drift.
But for a personal site, that would be overbuilt.
The current version has the shape I want:
- no secret required
- bounded latency
- cached for a few minutes
- typed parser output
- graceful fallback
- one focused regression script
That is enough infrastructure for a little green grid.
The lesson is not "scrape GitHub." The lesson is smaller: when a decorative component depends on someone else's markup, give it a deadline, validate what you parse, and make failure look intentional.
Tiny components deserve engineering judgment too. Just not too much of it.