Code, Culture & Scale: A Small Team with Big Data
Prologue
The discipline of data science, as the name suggests, is built on the foundations of empiricism: the observability and repetition of patterns in data that help us make informed decisions. However, building a team, especially from the ground up for a data science consultation service, takes a lot more than applying rules deduced from an observed phenomenon, because the human element that renders the service is random. One could argue that this random error is too significant to ignore yet difficult to estimate or predict. Ergo, team building at any ambitious organization becomes more of an art than a science (at least until the Neuralink with a scoring module for service delivery arrives).
Humor aside, this is precisely the challenge we at DSRS had to contend with. How do we build a team that leverages state-of-the-art tools and is adept at addressing today's research needs and challenges? How do we do this while also minimizing the variance in service delivery? And how do we plan and account for employee turnover as a predominantly student-led organization? These are the travails of DSRS in a nutshell.
While some of this may feel like broad strokes, I will try to be as specific as possible whenever the details matter. This post is an effort to document our journey for the researchers who intend to work with us and for those we have worked with in the past. I hope it also inspires practitioners interested in building scalable data teams and, most importantly, our interns who are keen to push data science research frontiers.
Code
"A leader leads by example, not by force." — Sun Tzu
So, if the question comes down to how we lead and inspire a team in the agentic era, the answer may sound so simple that it is at times ignored. A case in point is our response to the increasingly frequent generative AI model launches.
While our initial approach at DSRS was to observe and deploy the latest open-source model releases that could fit on our hardware, we soon realized that the more scalable approach was to build harnesses around these increasingly frequent releases. To that end, we are working towards building Atlas, an agentic base layer that runs multiple agents in parallel and can be called from any given interface. One of its first implementations is Atlas for Pitchbook, where we deployed the framework in service of researchers trying to explore the dataset. We have also built our harness so that, depending on the need, our interns can build an agent with custom rules on this base layer by simply inheriting the base agent class.
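To make the inheritance idea concrete, here is a minimal sketch in Python. It is not the actual Atlas code: the names BaseAgent, DatasetExplorerAgent and run_parallel, the example rules, and the use of a thread pool for parallelism are all assumptions made for illustration, and the model call is replaced by a placeholder string.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class BaseAgent(ABC):
    """Hypothetical base agent: shared plumbing lives here."""

    # Rules every agent follows; subclasses can extend this list.
    rules: list[str] = ["Cite the dataset columns used in each answer."]

    def __init__(self, name: str) -> None:
        self.name = name

    def build_prompt(self, task: str) -> str:
        """Combine the agent's rules with the incoming task."""
        rule_text = "\n".join(f"- {rule}" for rule in self.rules)
        return f"Rules:\n{rule_text}\nTask: {task}"

    @abstractmethod
    def act(self, prompt: str) -> str:
        """Subclasses decide how to handle the prompt (e.g. call a model)."""

    def run(self, task: str) -> str:
        return self.act(self.build_prompt(task))


class DatasetExplorerAgent(BaseAgent):
    """Example custom agent: inherits the base and adds its own rule."""

    rules = BaseAgent.rules + ["Only describe data; never modify it."]

    def act(self, prompt: str) -> str:
        # Placeholder for a real model call, which this sketch does not make.
        return f"[{self.name}] received {len(self.rules)} rules"


def run_parallel(agents: list[BaseAgent], task: str) -> list[str]:
    """Run each agent on the same task concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        return list(pool.map(lambda agent: agent.run(task), agents))


agents = [DatasetExplorerAgent("explorer-1"), DatasetExplorerAgent("explorer-2")]
results = run_parallel(agents, "Summarize deal counts by year")
```

The point of the design is that a new agent only has to declare its extra rules and implement one method, while prompt assembly and parallel execution stay in the shared layer.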
Code, so to speak, has an impact on how we approach culture and scale, especially in an organization whose goal is to build products that are modular and scalable. In a way, how we architect a solution also becomes a way to influence both.
Culture
"Culture does not make people. People make culture." — Chimamanda Ngozi Adichie
Internally at DSRS, we have always referred to ourselves as a fledgling startup — highlighting our penchant for experimentation and improvement based on observations. Thinking back, this was the bedrock of all our ambition: creating an environment for our student interns to learn, thrive and excel all while keeping the big picture in mind — to help improve the research output at Gies.
Building an incredible culture at DSRS begins with hiring interns who are not just talented but can also think on their feet and communicate. Given the significance of what we are trying to build, it has been extremely important for us to make the hiring process interesting as well as inclusive (which is why every one of our applicants gets to take a screening test) and personable (which is why all our hiring evaluations are made by human reviewers), while leveraging AI to help identify our candidates' strengths and weaknesses.
Creating a great culture doesn't just involve code; it also depends on inscribing the mundane. Building a team from multicultural backgrounds requires, at times, stating the obvious. In that regard, we have a code of conduct for our interns that underlines some basic workplace expectations and describes how to communicate with our stakeholders.
It is hard to ignore the impact of the agentic era on an organization that rests on data. To keep up with the changes in the current environment, we are working towards a Data Governance Policy that also helps classify all the assets that are created, managed and owned by DSRS.
Scale
"Sunlight is the best disinfectant." — Louis Dembitz Brandeis
Sometimes, our ambition to deliver services and products to impeccable standards has led us down a path where it becomes difficult for our stakeholders to comprehend our skills and capabilities. To help mitigate this, we are currently working on a blockchain-based platform that improves our transparency by showcasing all the projects we are working on, along with their current status and expected delivery timelines. We believe this approach would also go a long way toward addressing a pertinent issue that we have faced since our inception: making sure that we deploy our limited funds in the more impactful endeavors, along with the ability to estimate costs for each undertaking.
Epilogue
Our journey has been arduous and enlightening, and in hindsight we have had big wins and incredible learning opportunities (whenever we failed) over the years. This is possibly the elusive nature of perfection. To paraphrase our director, Matias Carrasco Kind, we are a small, determined team driven by an infinite ambition to make an impact.
May the force be with us.
What are your thoughts on this blog post? You can write to us at dsrs@business.illinois.edu.
