Yossi joined Fulcrum Genomics in July 2015. Throughout his career he has enjoyed exploring complex data, learning about new technologies, and improving analysis methods. We recently sat down to chat about work and life as a bioinformatics consultant.
What’s your area of expertise, and what excites you about your work?
Yossi: In broad strokes, the analysis part of ‘Data Analysis’: coming up with relevant metrics and QC for data, especially when there’s a novel element and we don’t yet have a standard. Whether it’s a novel assay, a new machine, a new use case, a novel experimental design, or any novel combination of things, we’re thinking about how to help our client capture something that hasn’t been done enough times to know how to do well. That’s usually my specialty.
Other things I do a lot, because I have a lot of experience with them, are sample identity, metadata accuracy, and contamination investigations in general. Not new, but a constant thread, and something that’s often otherwise neglected.
I love learning about our clients’ processes and challenges. Whatever it is, there’s usually something hidden, and I delight in the unknown, the unclear, and the not-working; the exciting bit for me is figuring out those challenges.
What’s a common challenge in our industry that people don’t talk about enough?
One of the biggest challenges we face, and one that directly affects both our work and the results, is being brought in too late. The experiment’s already been run, the data are in, and we’re left wishing we could have helped shape the study design. Instead we’re untangling batch effects, missing context, and avoidable quality issues. Sometimes low-quality data is just what you’ve got — every dataset has its limits, and you work with it. But other times the quality is poor because of how the experiment was run: it wasn’t part of the experimental design or the plan, it just happened. We’re often asked to participate when things are going wrong, and while we’re good at that, it would have been a lot easier if we’d been brought in earlier. As the saying goes, “Statistics is an autopsy of data” — and this message might be our homage to pilot studies: a reminder that a little upfront planning can save a lot of pain later on.
What’s one tool, tip, or mindset shift that has made a big impact in your work?
There’s this idea some mathematicians and data analysts have, that you don’t need to understand the data, that your analysis can be totally agnostic to its meaning. But I’ve found the opposite to be true. There are so many ways to analyze a dataset, so many interpretations you could make, that if you don’t understand where the data came from or what it’s meant to support, you’re going to struggle. If you just translate the request literally, you can end up with something that’s technically correct — statistically significant and full of buzzwords — but completely useless to the client. The real value isn’t in picking the fanciest statistical test; it’s in keeping the real-world goal in mind. That shift had a big impact on how I work: the details of the math can be fascinating, but it’s easy to get lost in them and forget the problem you’re actually trying to solve.
What’s a recent project or insight you’re particularly proud of?
I recently looked at a dataset and identified what seemed to be a 1% swap rate, based on very little data: I didn’t have access to the reads and was working only with summary statistics. I was really hesitant to call it a swap, because I worry I’m always calling out swaps, so I looked at it in all different ways before speaking to the client. It’s not a small thing to tell your client they have a sample swap. But eventually I had to say that the data I was given appeared to have a swap in it. The client, understandably, was terrified and started to look into it. Eventually we determined that the apparent swap came from a small number of runs that had been used as some sort of QC test; their data had been added manually, and erroneously, and were never part of the study. It was good that I found the issue, and good that there was no real issue. There’s a lesson here that relates back to the last question: even when you have good evidence, you don’t necessarily have the full story until you understand the process, history, and goals of the data you’re analyzing.
What’s something outside of work that inspires how you think about problem-solving?
I am a rock climber, and I’ve found that climbing experience is a pretty good metaphor for many things that I do. I’ve found connections to work, to glass blowing, to parenting... Working step by step, working in a team, looking out for each other, having a safety net: all those things combined are a very distilled way of thinking about problem solving. And there’s this really nice thing that happens in climbing, which happens in real life too, obviously, but is especially clear here: you’re struggling with a certain move or position and it’s the only thing you can think of. But the moment you overcome it, it’s completely gone; you’re over it, it’s vanished, and you’re on to the next move. You don’t have to think about it anymore. When the whole problem is done you can look back and reminisce, but in the moment you don’t have to carry irrelevant problems. It’s true in life, but climbing makes it so clear that that’s how it is.