Pivotal Engineering Journal
Technical articles from Pivotal engineers.
Feb 1, 2017
Agile Development for Highly Scalable Data Processing Pipelines
Legacy data processing pipelines are slow, inaccurate, hard to debug, and can cost thousands of dollars in lost revenue. Following agile methodology and a detailed seven-step approach can ensure an efficient, reliable, and high-quality data pipeline on a distributed data processing framework like Spark. Learn how TDD, careful creation of data structures, and parallel execution result in code competency and completeness, and in a robust big data processing pipeline with linear or constant scaling.
Sep 15, 2016
Test-Driven Development for Data Science
Unravelling Test-Driven Development for Data Science.
Apr 27, 2016
Distributed Pair Programming: What Works!
Tales of pair programming on a distributed team.
Jan 15, 2016
Pairing for Data Scientists
Let's see how pair programming fits in the data science world.