- MH370 – Definition of underwater search areas
- Search for the Wreckage of Air France Flight AF 447
- Google papers – 2013
- Stop producing chaos
“Companies often suffer high levels of rework and scrap simply because they do not have a clear understanding of the state of control of their process and their capabilities. This can lead to fragmentation of the quality effort and confusion across the organisation – Engineering is frequently at loggerheads with Manufacturing over the setting of specifications which are too often set without reference to the state of control of processes or their potential ability to meet the specs.”
The quote is about manufacturing process control, but it got me wondering about the implications for software development: do you build the “right” system or the one that you are actually capable of building?
Monthly Archives: June 2014
Daily Link
- Ruth Chang: How to make hard choices
- Achieving Rapid Response Times in Large Online Services – an old topic but a goody.
- TSAR, a TimeSeries AggregatoR
- 14 free (as in beer) data mining books
- Architecting a Machine Learning System for Risk
“Most people who’ve worked in machine learning will find this obvious, but it’s worth re-stressing:
If your ground truth is inaccurate, you’ve already set an upper limit to how good your precision and recall can be. If your ground truth is grossly inaccurate, that upper limit is pretty low.”
- Engineering a safer world
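To put a rough number on the ground truth point from the machine learning quote above, here is a minimal sketch. It is not from the linked article. It assumes a hypothetical perfect classifier scored against labels that are flipped at random with the same rate for both classes; the function name and parameters are made up for illustration.

```python
def measured_precision_recall(prevalence: float, flip_rate: float) -> tuple[float, float]:
    """Precision and recall that a *perfect* classifier appears to achieve
    when it is scored against noisy ground truth.

    Assumptions (illustrative only):
      - prevalence: true fraction of positive cases, 0 < prevalence < 1
      - flip_rate: probability that any ground-truth label is wrong,
        independent of the true class (symmetric noise), 0 <= flip_rate < 1
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0.0 <= flip_rate < 1.0:
        raise ValueError("flip_rate must be in [0, 1)")

    # The classifier predicts the true class, so every disagreement with the
    # recorded label is caused by the label, not by the model.
    tp = prevalence * (1.0 - flip_rate)   # truly positive, labelled positive
    fp = prevalence * flip_rate           # truly positive, labelled negative
    fn = (1.0 - prevalence) * flip_rate   # truly negative, labelled positive

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


if __name__ == "__main__":
    # 1% of cases are truly positive and 1% of labels are wrong.
    precision, recall = measured_precision_recall(prevalence=0.01, flip_rate=0.01)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these assumptions, when positives are rare a 1% label error rate is enough to cap the measured recall of a flawless model at about one half, because the mislabelled negatives are as numerous as the real positives.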
Daily Link
- Minimal Viable Bureaucracy
- Lock your knees – Habit change is hard. “While the initial trigger (or motivation) is the catalyst that starts the ball rolling, for the change to really manifest into habit forming behavior, you need periodic and regular triggers that keep bringing you back to the specific activity.”
- Ferrari restructuring must allow engineers to be creative. – Winning, like continuous deployment or any other software activity, is a habit and habits must be formed and maintained. “Ferrari has lost the winning habit and he needs to recreate the culture that existed there under Jean Todt.”
- Apache Mesos – Cluster management.
- Erlang 17.1 is out – or via Erlang Solutions.
- Jim Barksdale quotes:
“If it works, it’s a product. If it doesn’t, it’s market research.”
“In the battle between the bear and the alligator, what determines the victor is the terrain.”
- Creating the future
“Everything we see is the result of someone, at some point, wondering ‘Wouldn’t it be nice if…?’ or ‘I wonder if…?’ and they had the guts and the courage to go for it.”
“We cannot programme our GPS to a destination that doesn’t exist.”
In talking to a senior executive at a Fortune 500 company about a promotion to VP that the executive doesn’t want to take because of all that accepting the VP position would require:
Executive: If I say no it will ruin my career
Gerald: But if you say yes it will ruin your life, which is worse?
- Site Reliability Engineering
“The solution that we have in SRE — and it’s worked extremely well — is an error budget. An error budget stems from this basic observation: 100% is the wrong reliability target for basically everything. Perhaps a pacemaker is a good exception! But, in general, for any software service or system you can think of, 100% is not the right reliability target because no user can tell the difference between a system being 100% available and, let’s say, 99.999% available. Because typically there are so many other things that sit in between the user and the software service that you’re running that the marginal difference is lost in the noise of everything else that can go wrong.
If 100% is the wrong reliability target for a system, what, then, is the right reliability target for the system? I propose that’s a product question. It’s not a technical question at all. It’s a question of what will the users be happy with, given how much they’re paying, whether it’s direct or indirect, and what their alternatives are.
The business or the product must establish what the availability target is for the system. Once you’ve done that, one minus the availability target is what we call the error budget; if it’s 99.99% available, that means that it’s 0.01% unavailable. Now we are allowed to have .01% unavailability and this is a budget. We can spend it on anything we want, as long as we don’t overspend it”
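The arithmetic in that last paragraph is easy to play with. Below is a minimal sketch, not anything from the SRE interview itself: the 30-day window and the function names are my own assumptions, chosen only to turn "one minus the availability target" into minutes of allowed downtime.

```python
def error_budget_minutes(availability_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, over the window.

    availability_target is a fraction, e.g. 0.9999 for "99.99% available".
    The 30-day default window is an assumption for illustration.
    """
    if not 0.0 < availability_target <= 1.0:
        raise ValueError("availability_target must be in (0, 1]")
    total_minutes = window_days * 24 * 60
    # Error budget = 1 - availability target, applied to the whole window.
    return (1.0 - availability_target) * total_minutes


def budget_remaining(availability_target: float,
                     downtime_minutes_so_far: float,
                     window_days: int = 30) -> float:
    """Minutes of budget left; a negative value means it is overspent."""
    return error_budget_minutes(availability_target, window_days) - downtime_minutes_so_far


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999, 0.99999):
        minutes = error_budget_minutes(target)
        print(f"{target:.3%} available -> {minutes:.2f} minutes of budget per 30 days")
```

For the 99.99% target in the quote, a 30-day window leaves roughly 4.32 minutes to spend, which makes it clearer why the target is a product decision and not a purely technical one.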