Showing posts from February, 2012

Command-line data analytics

Here are some patterns that were important for most infrastructure functions: infrastructure operations, software development, and even large-scale infrastructure changes, migrations, and predictions of future states.

The slice: "| grep | project"
The map-reduce pipeline: "| transform | project | sort | uniq -c | sort -nr"
The pivot: "| project | pivot-and-aggregate"
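
As a rough illustration of the three patterns above, here is a minimal sketch using standard POSIX tools. The sample log (host, HTTP status, response time in ms) is made up for the example. The post does not define "project", "transform" or "pivot-and-aggregate" as concrete commands, so this sketch assumes column selection with awk or cut, a status-to-class mapping, and an awk associative array, respectively.

```shell
#!/bin/sh
# Hypothetical sample data: host, HTTP status, response time (ms).
sample() {
cat <<'EOF'
web1 200 12
web2 500 340
web1 200 15
web3 404 8
web2 500 410
web3 200 9
EOF
}

# The slice: grep picks the rows, awk projects the columns of interest.
slice=$(sample | grep ' 500 ' | awk '{print $1, $3}')

# The map-reduce pipeline: transform each status into its class (2xx, 4xx, 5xx)
# and project just that field, then count occurrences, most frequent first.
counts=$(sample | awk '{print substr($2, 1, 1) "xx"}' | sort | uniq -c | sort -nr)

# The pivot: project host and status, then aggregate into one row per host
# with one count column per status (200, 404, 500).
pivot=$(sample | cut -d' ' -f1,2 | awk '
  { c[$1, $2]++; hosts[$1] = 1 }
  END {
    for (h in hosts)
      print h, c[h, "200"] + 0, c[h, "404"] + 0, c[h, "500"] + 0
  }' | sort)

echo "slice:";  echo "$slice"
echo "counts:"; echo "$counts"
echo "pivot:";  echo "$pivot"
```

The slice answers "which rows matter?", the map-reduce pipeline answers "how often does each thing happen?", and the pivot lays the same counts out as a host-by-status table. The final sort in the pivot is only there because awk does not guarantee the iteration order of an array.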

The output of these patterns is typically fed into the execution pipeline.

These patterns are essential because they direct, focus, and scale the primary output beyond what expertise alone makes possible.

There is nothing magical about any of these. But as a considered habit, an involuntary muscle memory, they can bring enormous leverage and timeline compression to any complex piece of work.

One pattern is missing from this list: a classification/machine-learning filter.

We need an "ml-filter" to make the future architect/webops engineer's end result to bec…