What is Data Streaming, Exactly?

Eileen Pangu
3 min read · Dec 23, 2020

If my formal computer science education has left me with any pet peeve, it is a strong dislike of ambiguous definitions. Data streaming is one of those overused and under-defined terms that always gets under my skin. What is data streaming, exactly? Why is it used in ways that are so vague, if not contradictory? Do people use it just because it sounds cool? Frustrated by how loosely the term gets thrown around, I decided to write a quick blog post to straighten out my thoughts. The hope is that I’ll at least be better prepared the next time I hear it.

I think the main confusion comes down to the many facets of data streaming. People refer to different things when they use the term in different contexts. I’ll try to enumerate the situations in which I’ve heard data streaming mentioned. The keyword is “streaming”. Let’s decipher that.

Streaming could mean incremental, in contrast to bulk or batch: moving or processing data one piece at a time versus moving or processing it in big chunks. The line between individual pieces and big chunks is blurry. Does one piece of data count as individual? It should. What about ten? Maybe. What about a thousand? Probably not. It also depends on the size and complexity of each piece. This definition is very subjective and usually only makes sense as a comparison.
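To make the incremental-versus-batch contrast concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`batch_total`, `streaming_totals`, `chunked`) and the sample readings are hypothetical and do not come from any particular streaming system.

```python
from typing import Iterable, Iterator, List


def batch_total(readings: List[float]) -> float:
    """Batch style: the whole data set must exist before any work starts."""
    return sum(readings)


def streaming_totals(readings: Iterable[float]) -> Iterator[float]:
    """Incremental style: emit an updated total after every single reading."""
    total = 0.0
    for reading in readings:
        total += reading
        yield total


def chunked(readings: Iterable[float], size: int) -> Iterator[List[float]]:
    """Middle ground: group readings into chunks of at most `size` items."""
    chunk: List[float] = []
    for reading in readings:
        chunk.append(reading)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # flush the final, possibly shorter, chunk
        yield chunk


# Hypothetical sample data, used only for illustration.
readings = [1.0, 2.0, 3.0, 4.0, 5.0]
print(batch_total(readings))             # 15.0
print(list(streaming_totals(readings)))  # [1.0, 3.0, 6.0, 10.0, 15.0]
print(list(chunked(readings, 2)))        # [[1.0, 2.0], [3.0, 4.0], [5.0]]
```

The `chunked` helper shows why the line is blurry: with a size of 1 it behaves like the incremental version, and with a size as large as the input it behaves like the batch version.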

Streaming is often used as a synonym for live, or real-time. The emphasis is that it’s happening right now, in the moment. The implication is that the corresponding processing must be fast and responsive…

Written by Eileen Pangu

Manager and Tech Lead @ FANG. Enthusiastic tech generalist. Enjoy distilling wisdom from experiences. Believe that learning is a lifelong journey.