My (Long) Story
Last Updated: January 2023
I grew up in a closet-sized apartment on the 5th floor of a 24-story apartment complex in New York City with two parents and a fraternal twin brother. While I lived in Manhattan, I spent almost no time at home. I went to high school in Brooklyn, played sports in the Bronx and Queens, visited relatives in Staten Island, and paid frequent visits to my grandmother in New Jersey. I got roughed up and hardened by the New York City public school system and Metropolitan Transportation Authority. The sound of “stand clear of the closing doors, please” haunts me in my dreams.
Entrepreneurship runs in the family. After graduating from podiatry school in the 1970s, my dad took out a loan, refurbished a storefront, and started a private medical practice on the Upper East Side of Manhattan. When I was old enough, my first job was to be his medical assistant, filing paperwork and standing over the operating table as my dad trimmed the toenails of wealthy, eccentric Manhattanites. In my pseudo-residency, dad would methodically teach and quiz me about all the different problems of his patients. Eventually, to the chagrin of his patients, I would give spot diagnoses.
From ages 1 to 18, all of my time was spent training for, thinking about, and competing in the sport of tennis. All of my friends were tennis players. All of my family members were tennis spectators. When I was 12, my brother and I designed some business cards and started “Selig Stringing Services.” With a loan from our parents, we bought a tennis racquet stringing machine and began offering tennis racquet repair services to people in our neighborhood. My parents indulged my eclectic interests and pushed me along the way, hoping that all the effort would pay off with a college scholarship. I had bigger dreams, though: I wanted to become a professional tennis player.
Despite placing high in the rankings for my region, in a pivotal moment, I was turned away by the McEnroe Tennis Academy for not meeting physical requirements (being short and scrawny runs in the family). It was clear that athletics was not my forte. Still, having built up plenty of career capital, I made the best of my skills. From ages 14 to 18, I worked my third job as a ballboy at the U.S. Open (this time with a real paycheck). It was at the U.S. Open where I high-fived Novak Djokovic, patted the head of Roger Federer, was shouted at by Nick Kyrgios, held up the American flag for Serena Williams after her devastating 2011 defeat, and, strangely, danced with Carly Rae Jepsen on live television.
Photo: Standing between Serena Williams and Samantha Stosur - U.S. Open 2011 Women’s Final
I never thought of myself as an intelligent person. Actually, quite the opposite. Up until high school, I had unremarkable grades and an inability to focus for moderate amounts of time. I remember being pulled aside one day in grade school to work with the “special” kids after failing to memorize and recite answers to basic history questions in front of the rest of the class. When I failed to get into my high school of choice, I ended up going to an equally rigorous, but incredibly large high school of 7000 students known as Brooklyn Tech. I vowed to work my a** off to prove I was a better student.
Most of my childhood, I thought I’d become a medical doctor. At Brooklyn Tech, students were required to select a major in which to concentrate their studies. Naturally, I chose Biological Sciences. I studied the equivalent of a pre-medical curriculum and worked my way to the near top of the class. Circumstances changed, however, when I was introduced to Electrical Engineering in an introductory course on Digital Logic. At a parent-teacher conference at the end of the school year, Mr. Grosshart pointed to the top line of an Excel spreadsheet with student rankings on his computer screen and remarked to my mother, “Look here, I have over 200 students. This is your son. If that doesn’t mean something, I don’t know what does.”
Despite what would appear as success, I was a social outcast. All of my time was spent either studying or playing tennis at the expense of making any friends whatsoever. When I graduated, I held my major’s banner at our ceremony. From that day forward, I made my second vow to change. This time, it wouldn’t have to do with academics.
Mr. Grosshart was instrumental in convincing me of my technical skill. There was, in fact, a world outside of medical school. When it came time to apply for college, my parents wanted me to stay in New York. I wanted to move as far away as possible. They urged me to apply to schools in-state. I applied to Cornell University in Ithaca, NY, a remote region measurably distant from NYC, almost bordering Canada. However, the College of Engineering waitlisted me.
I composed and sent the Cornell University admissions office a letter strongly asserting my interest. To my luck, I was awarded a seat (I would later learn that such was a rarity).
When I got to Cornell, I did everything that would make the old Justin scream in terror. I ran in elections for various student governments (seven times), made as many friends as I possibly could, and took on leadership positions that diametrically opposed my old personality. The overcompensation turned out to work, and among the chaos, I found myself a much happier, much more fulfilled human.
4.5 years, 4 internships, 8 on-campus jobs, 5 club-leadership positions, and 1 mental health crisis later, I finished my Bachelor’s and Master’s in Electrical and Computer Engineering. As the saying goes, “getting into Cornell is hard, but getting out is harder.”
By the end of college, I accumulated enough experience to become a dangerous asset on the job market - having worked internships at SpaceX, Uber ATG, and several other high-tech companies. In each instance, I saw what working a job at a big company was like. It wasn’t for me.
Photo: Photobombing a SpaceX webcast
Like a properly risk-prone college senior, I bet my early career on a technology sector that no one had ever heard of: AI hardware accelerators. The gateway decision was based largely on gut instinct. In my 2017 internship at Uber ATG (aside: an impressively bizarre summer), I asked my manager what he thought I should spend my grad degree on…
“Learn to program in CUDA,” he suggested, emphasizing a skill needed to land a full-time role with his team.
“Alright, that’s what I’ll do.”
It was prescient advice. While Uber ATG was transitioning to GPUs for running their AI algorithms in CUDA, so was the rest of the self-driving industry.
I dedicated my grad degree to studying the intersection of computer architecture and artificial intelligence (AI). And of course, I learned CUDA. Somewhere along the way, I picked up a paper authored by researchers at Google unveiling the TPU, a computer chip for running AI algorithms in the datacenter. In that paper was a sentence that, as far as I can tell, changed the silicon electronics industry for a lifetime:
"a projection where people use voice search for 3 minutes a day using speech recognition DNNs would require our datacenters to double to meet computation demands"
It was clear to me and a generation of silicon entrepreneurs that, while GPUs were inevitable, ASICs - more specifically, AI chips - would be their disruptors. I doubled down on my studies.
The details are fuzzy. I remember a hallway conversation with a classmate on post-grad plans:
Joe: "How's the job search going?"
Justin: "Likely going back to Uber ATG. I'm interested in doing AI software."
Joe: "That's cool. My friend just took a job at some startup building a chip for AI."
Startup? AI chip? My ears perked up. I got the contact info of that company’s CTO and cold-emailed him repeatedly until receiving a response akin to, “what do you want, kid?”.
The CTO was Bill Lynch and the company was Cerebras Systems.
AI Kernel Engineer
I convinced Bill that it was worth his time to interview me for a position as an embedded software engineer. My prior experience pointed to classical software as the entry-point to a full-time role at Cerebras. Still, I really wanted to build AI software.
They sent me a coding project and I spent the next week refining my submission round-the-clock. By the end of 2017, with a student loan in one hand and laptop in the other, I made my way to Silicon Valley and started as the first new grad hire to the hardware systems org at Cerebras. I’d be the lowest-paid engineer at the company. But it didn’t matter. I wanted to be there.
Then came the bait and switch. Having built up requisite expertise in GPU programming, it was clear that I had use outside of embedded software. On my first day, I was placed onto a team of three terrifying, remarkable engineers hired out of the fabled Nervana Systems. As fate would have it, I’d be responsible for building the kernel software underlying the AI algorithms that ran on Cerebras’ chip. The main requirement: a deep familiarity with CUDA and AI.
Yet again, as the youngest employee in my division, I was rife with imposter syndrome. My teammates spoke in terms of math equations and supercomputer algorithms - a far cry from the bits and bytes I knew as a firmware engineer. But that insecurity was channeled into hyper-productivity. Soon, I took on significant projects with an expanding scope of responsibility. I managed teams, architected large software projects, and sat on the front-lines writing mission-critical code.
A few years later, the entrepreneur gene kicked in.
In my first meeting with Cerebras’ CEO, I vowed to “do whatever it takes to create the most value.” But after delivering that value as a purebred engineer, it was clear that I had value beyond an individual contributor role. I turned my attention to strategy. Cerebras faced an existential threat in the form of CUDA, NVIDIA’s flagship software platform for kernel development. NVIDIA’s success could be largely attributed to this almost 20-year-old software platform. Any new AI models in the community were built using CUDA by default. NVIDIA had an unfair advantage and no one (arguably still) would have a way to compete.
While Cerebras would excel on hardware, we’d die by the hand of software – a lesson every AI chip company must now reckon with. There was a technical path forward, but it wouldn’t be easy. To compete against CUDA, we’d need to widen the pool of developers contributing software… not reactively produce kernels in response to customer demands. For that, we needed a way to attract skilled kernel developers from the public. We needed our own CUDA.
So, in 2019, I went to the Product team with a 50-slide deck outlining a vision to create an SDK for external developers. Leadership was immediately sold. (Just kidding, it took almost a year to convince people to resource the project. In retrospect, it was foolish to think that deck would lead anywhere… the proposal would’ve been a massive undertaking). The Product team eventually rallied around the idea and a star team member, Jessica, took me under her wing.
We refined that initial vision and fought to get the SDK off the ground. We built a team and soon other leadership extended support. In an ironic twist of fate, I was now in a position to publicly replace the very same software that got me to Cerebras. And no company, other than Cerebras, would have a better chance at succeeding.
With more responsibility comes more visibility. And the SDK team had no shortage of either. Among other commercial goals, building a new hardware/software platform meant enabling new fundamental research as a measure of success. We did that too. One of my favorite stats: stencil computation (the backbone of CFD simulation) using the SDK performs >200x faster on a single CS-2 than the same algorithm run on any size supercomputer.
Edit (2023): A Cerebras paper using the SDK is a finalist for the Gordon Bell Prize.
For the next three years, we built a startup within a startup. I led functions in Engineering, Product, Marketing, Sales, and Support, personally onboarding every customer until my last day on the job. This, as it turns out, was the perfect setup for my new life as a…
To say that “everything led to this moment” is over-the-top and approximates hindsight bias (dammit, just said it).
But in all seriousness, I’m fortunate to work a job at Eclipse that, until now, represented a special obsession: discovering and investing in generational technology companies.
Some of the things I think about nowadays include AI infrastructure, semiconductors, next-gen datacenters, and medical imaging. If you’re reading this and want to trade insights in these domains, let’s chat. If you happen to have some esoteric obsession over something that could change the world, we should grab tea ☕. Be warned, you might learn more than you want about MRI, meditation, or monolithic computer chips.
Until the next update… cheers to more stories to tell,