Thursday, 29 December 2011
Zeebox ranked number 6 media application, above Facebook
The top ten media apps of 2011
Not a bad result for a product that took three months of development before its first release on the iPad and has not even been released outside of the UK yet. 2012 is going to be a big year for us!
Labels:
zeebox
Wednesday, 28 December 2011
Handling Java nulls in Scala
Null handling in Scala can be painful. To ease this pain, I like to avoid the problem entirely by not using nulls at all. However, sometimes one has to interoperate between Java and Scala, and then one has to face the problem head on.
The cleanest approach that I have found, one that does not result in a deranged mix of generics or implicits, is to use an anti-corruption layer that sits as close to the Java boundary as possible and converts Java's nulls into an Option immediately.
For example: Option(o).map(_.asInstanceOf[T])
If o is null, then the above snippet will evaluate to None, else Some(o).
Stringing Option together with map and closures is rarely readable, at least not until you have bathed in it long enough to have Scala'ized your mind. So I like to wrap this up in a function, as follows:

def asInstanceOfNbl[T]( o:AnyRef ) : Option[T] = Option(o).map(_.asInstanceOf[T])
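To show the anti-corruption layer in use, here is a minimal, runnable sketch; the Java-style registry class and all of its names are hypothetical, standing in for whatever Java API actually hands back nulls:

import java.util.HashMap

// Hypothetical Java-style API whose lookup returns null on a miss.
class JavaStyleRegistry {
  private val entries = new HashMap[String, String]()
  entries.put("host", "localhost")
  def lookup(key: String): String = entries.get(key) // may be null
}

object NullBoundary {
  // The anti-corruption layer: null becomes an Option at the boundary.
  def lookupOpt(registry: JavaStyleRegistry, key: String): Option[String] =
    Option(registry.lookup(key))

  def main(args: Array[String]): Unit = {
    val registry = new JavaStyleRegistry
    println(lookupOpt(registry, "host").getOrElse("unknown")) // prints localhost
    println(lookupOpt(registry, "port").getOrElse("unknown")) // prints unknown
  }
}

Everything downstream of lookupOpt deals only in Option; the null never leaks past the boundary.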
Friday, 5 August 2011
IntelliJ vs Scala [The Battle Rages On: A productivity tip]
Following a short discussion yesterday at work about how IntelliJ keeps changing Scala keywords to Java class names while you are typing, forcing you to break flow, go back, correct it (swear) and then continue, I dug around the IntelliJ settings and found the following very useful setting. In Java it works wonders; in Scala it is just plain irksome.
Settings->Editor->Code Completion->Preselect the first option
When writing Scala, set it to 'Never'. The Scala keywords val and override will never be quite as annoying again ;)
Sunday, 12 June 2011
Java 8 wishlist
I have blogged before about my dissatisfaction with the rate of improvement of the Java platform. While the situation is understandable, it still leaves me annoyed. Below is an updated wish list of what I would like to see in Java 8; a short Scala sketch after the list shows how several of these features look in a language that already has them.
I am increasingly considering getting my hands dirty and adding some of these features myself, if only I had more time. So my first wish is for a clock with a big 'stop the world' button on it. Having an infinite amount of time, with high energy levels, no risk of RSI and no risk of interruptions, would be great. Please create this for me, Mr S. Hawking.
JVM
support jar-level constant pools to reduce class sizes
enhanced GC
transactional memory support
add method parameter names to reflection
access to javadoc from an API (provided in optional packaging that can be accessed via reflection); include the comment inherited from an abstract method
Simple syntax sugar
.? null-safe dereference operator
multi line strings
versioned jar file/module system
drop the annoying diamond operator (<>) added in Java 7 but keep the functionality
optional method parameters with default values
language syntax for reflection
language support for collections
pre-condition support on method signatures
More involved language features
mix-ins
closures
Language support for custom types
replace try-with-resources with a keyword; the keyword declares that an object is not to escape the stack
Inferred type support (e.g. var s = "abc"; s is strongly typed to String, so why declare it twice?)
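For what it is worth, several of the items above already exist in Scala today. A minimal sketch (all names invented for illustration) showing mix-ins, closures, inferred types, default parameter values and multi line strings in one place:

// Mix-ins via traits.
trait Auditable { def audit(msg: String): Unit = println("AUDIT: " + msg) }

class Account(val owner: String) extends Auditable {
  private var balance = 0
  // Optional parameter with a default value.
  def deposit(amount: Int, currency: String = "GBP"): Unit = {
    balance += amount
    audit(owner + " deposited " + amount + " " + currency)
  }
}

object WishListDemo {
  def main(args: Array[String]): Unit = {
    val account = new Account("fred")   // inferred type: Account
    account.deposit(100)                // default currency used

    // Closures: a function value capturing a local variable.
    val bonus = 10
    val addBonus = (amount: Int) => amount + bonus
    println(addBonus(5))                // prints 15

    // Multi line strings.
    val banner = """line one
                   |line two""".stripMargin
    println(banner)
  }
}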
Labels:
java,
language design,
musing
Friday, 3 June 2011
Slow networks
Interesting points from http://highscalability.com/blog/2011/6/1/why-is-your-network-so-slow-your-switch-should-tell-you.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HighScalability+%28High+Scalability%29
Network operators get calls from application guys saying the network is slow, but the problem is usually dropped packets due to congestion. It's not usually latency, it's usually packet loss. Packet loss causes TCP to back off and retransmit, which causes applications to appear slow.
Packet loss can be caused by a flaky transceiver, but the problem is usually network congestion. Somewhere on the network there is fan-in, a bottleneck develops, queues build up to a certain point, and when a queue overflows it drops packets. Often the first sign of this happening is application slowness (a back-of-the-envelope sketch of the throughput impact follows).
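To put rough numbers on how loss, rather than latency, throttles TCP: the well-known Mathis et al. approximation bounds steady-state TCP throughput by (MSS / RTT) × (C / √p), with C ≈ 1.22. A minimal sketch, with invented example numbers:

object TcpLossSketch {
  // Mathis et al. approximation: throughput <= (MSS / RTT) * (C / sqrt(p)).
  def mathisThroughputBps(mssBytes: Int, rttSeconds: Double, lossRate: Double): Double =
    (mssBytes * 8 / rttSeconds) * (1.22 / math.sqrt(lossRate))

  def main(args: Array[String]): Unit = {
    // 1460-byte segments over a 10 ms RTT link.
    println(mathisThroughputBps(1460, 0.010, 0.0001) / 1e6) // ~142 Mbit/s at 0.01% loss
    println(mathisThroughputBps(1460, 0.010, 0.01) / 1e6)   // ~14 Mbit/s at 1% loss
  }
}

A hundredfold increase in loss costs a tenfold drop in throughput, which is why applications "feel slow" long before latency changes.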
Saturday, 21 May 2011
Which project metrics do you use?
My favourite software project metrics:
The main two that I use at a high level are end-to-end cycle time and benefit hit rate; a tiny sketch after the lists shows how a couple of these reduce to simple arithmetic.
Business Benefit Metrics
- End to End Cycle Time
- the length of time from the conceptualisation of an idea, or feature to the reliable use of that feature by the end customers.
- ROI cycle time
- The length of time from conceptualisation of a feature to when the feature pays for itself, expressed as a rolling average
- Benefit Hit Rate
- As a percentage, how many of the features that start being developed go on to return more to the business than they cost.
- Overhead
- Number of man days not spent producing the product, i.e. waiting for a build, waiting for a release, merging branches, waiting for broken builds, waiting for test runs to complete, waiting for hardware, attending meetings etc.
Developer Productivity Metrics
- Commit cycle time
- Length of time that it takes a developer from completing a feature to receiving confirmation that it has been accepted.
- Defect rate
- Number of defects per feature that was accepted as complete, expressed as a rolling average or over a period of time.
- Build time
- Length of time that it takes to run the build
- Commit test execution time
- How long it takes to run the commit tests
- Acceptance test execution time
- How long it takes to run the acceptance tests
- Automated test quality
- When a test fails, is the fix to the system code or to the test? Expressed as a ratio: number of fixes to system code / number of failed tests
User Experience Metrics
- Use case duration
- How long does it take an average user to perform a single goal
- Use case step count
- The number of user actions required to attain their goal
- User frustration
- (Number of user mistakes) + (number of system errors × 10)
- UI Responsiveness
- Length of time between a user click and the system response
Live System Metrics
- Uptime
- The length of time that the system has been up without a user-visible outage
- Call out count
- Number of times that a 'fatal' error has occurred requiring immediate support staff to take action, even if in the middle of the night
- Error count
- The number of errors reported in the log files
- Warning count
- The number of warning messages reported in the log files
- Support overhead
- Number of man hours required to keep the lights on
- Release duration
- How long does it take to perform a release, in man hours
Code Metrics
- Unit test coverage
- Percentage of lines of code tested by the unit tests
- Acceptance test coverage
- Percentage of lines of code tested by the acceptance tests
- Average class Cyclomatic Complexity
- Indicator of code complexity; the higher the value, the more complex the code base is to understand
- Worst method Cyclomatic Complexity
- The cyclomatic complexity of the single most complex method; highlights the worst hotspot in the code base
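As a tiny illustration (all numbers invented), two of the metrics above reduce to simple ratios:

object MetricsSketch {
  // Automated test quality: fixes that landed in system code / failed tests.
  def testQuality(fixesToSystemCode: Int, failedTests: Int): Double =
    fixesToSystemCode.toDouble / failedTests

  // Benefit hit rate: features that repaid their cost / features started, as a percentage.
  def benefitHitRate(profitableFeatures: Int, featuresStarted: Int): Double =
    100.0 * profitableFeatures / featuresStarted

  def main(args: Array[String]): Unit = {
    println(testQuality(fixesToSystemCode = 18, failedTests = 24))         // 0.75
    println(benefitHitRate(profitableFeatures = 12, featuresStarted = 20)) // 60.0
  }
}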
Labels:
agile,
project management
Refactoring is not the same as rearchitecting
Refactoring is a small-scale technique: a collection of small transformations that, when applied with skill to a code base, improve the readability of the code without changing its behaviour (a tiny example follows). Unfortunately it has become all too common to use the term with management to hide a project rearchitecting exercise or rewrite. Such work may be required; however, the purpose of refactoring is as a low-cost tool to use daily as part of one's programming practice, which helps avoid entropy in the first place.
For more information see: http://refactoring.com/
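A minimal example (invented code) of the kind of small transformation the term refers to, extract method plus naming a magic number, which improves readability without changing behaviour:

object CheckoutBefore {
  // Intent buried in an expression: what is 1.20?
  def total(prices: List[Double]): Double = prices.map(p => p * 1.20).sum
}

object CheckoutAfter {
  private val VatRate = 0.20
  // The magic number is named and the rule extracted; behaviour is unchanged.
  private def withVat(price: Double): Double = price * (1 + VatRate)
  def total(prices: List[Double]): Double = prices.map(withVat).sum
}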
Labels:
best practices,
musing
High cyclomatic complexity (say, over 80) is like handing in the brain-dump draft of an essay as the final essay.
Labels:
musing
Sunday, 15 May 2011
Software is not built. It is written.
This post compares the work of traditional engineers with that of software engineers; understanding where they are similar and where they differ helps me to understand the software development process. This post is both a treatise on how to make waterfall work and an explanation of why Agile is here to stay.
"An author would not expect the first draft of their book to be published. Few if any publishers would accept the first draft of a book. So why then, is it so common for software projects to release first drafts of their programs?"
Summary of the problem
When first drafts of software are released intentionally, the company making that decision will be aware of the consequences, and so in a far better position to manage them than a company that does not realise that the product it is releasing is effectively the first prototype of an engineering design. Usually the consequences include the cost of stabilising the product and long delays, sometimes creating a bottomless pit of expense just to keep the system turned on; and it follows that the cost of adding more features to such a system will be equally large, and will increase over time.

It is a large step to claim that a lot of projects release drafts of software; however, it is common practice for the first design of a solution to be implemented and for that first implementation to be passed to testers, after which there is often strong resistance to revising the solution to improve its behaviour and maintainability. The refactoring movement and the notion of code debt have helped to raise awareness, but it is still common for senior management and developers to disagree about what state the software product is in. How can it be so common for smart people to release first drafts of a design without realising that it is the first draft? How effectively can such a mistake be managed if senior management are not aware of the situation? In my experience the root cause has often been a misunderstanding of what a Software Engineer actually does, and as a consequence a misunderstanding of what they need in order to be effectively coordinated to deliver a successful project now AND to position that project for the success of the next one.
The building metaphor
I wonder whether part of the problem, part of the reason why some projects mistake a first draft for the final product, is the overuse of the 'building' metaphor commonly used to describe software development. A weakness in the metaphor that software is built is that it conflates the skills needed to manage a labourer performing an approximately deterministic task with those needed to manage a non-deterministic one. Building sites, factories, and any kind of manufacturing process are designed around managing tasks that are, to within some tolerance, deterministic. Consider a skilled builder who has perfected laying one brick after another with a small enough error rate that quality assurance is enough to ensure that the house will not fall down. Does it follow that the same process can be used for the author of a book? Or an engineer designing a bridge? Ask two engineers to design a bridge to the same spec and you will get two different bridges. Ask two bricklayers to build two walls to a spec, and you will get approximately two identical walls. The two kinds of work call upon different management skills; which of these examples is most like software development? A misunderstanding here will result in the wrong management tools being brought to bear.
Is the act of writing software deterministic, or is it non-deterministic? Ask two programmers to implement a timesheet program to the same spec and I guarantee you will get two very different programs, different in just about every way that one can think to measure. And I am not talking about the small tolerance introduced by the nature of the human hand, as in the bricklaying example; I am talking about fundamental differences, closer to the example of the two different bridges described earlier.
To make this difference clearer, let's look at the core of what a Software Engineer does, without using metaphors. The very least that every programmer has to do is write, in a programming language (Java, C#, JavaScript, C, SQL etc.), instructions to a computer telling it what to do, how and when. Knowing what instructions to give the computer is a problem-solving, design and modelling task; mapping that to one or more written languages is a translation and communication task.
The influence of Taylorism on software projects
So where is the building metaphor in this description of programming? In the 1980s there was every attempt to bring Taylorism to software development. In the same way that Taylorism removed the non-deterministic nature inherent in the craftsmanship of trades like wood and metal working on production lines, people attempted to remove it from programming too. Or, more precisely, the attempt was to move the non-deterministic nature of the craft out of the fabrication stage and into the design stage, thus allowing different skills, risk processes and management techniques to be brought to bear. This is the basis of the 'Big Upfront Design' project. Big Upfront Design (BUD) endeavours to raise confidence in when a new software product will ship, and to give the clients signing the cheque more confidence in what they are going to receive. The rationale is that this confidence can be afforded because the uncertainties will have been ironed out during the design stage, and the build stage can be reduced to an almost mechanical process.
If you have worked for a company that employed the BUD approach, then you have probably also witnessed a great amount of frustration as the 'build' stage never goes as smoothly as planned. The explanation offered is usually that the design stage was not followed rigorously enough to remove all uncertainty from the 'build' stage. The rest of this post explores why the goal of separating all design from the build stage, as has been done in other engineering disciplines, will never work for software development; and why a BUD effort that is not aware of the difference, and that expects to gain the benefits seen in traditional engineering disciplines, will on complex projects fail before the project has even started.
Where Software Engineers have an advantage
In more traditional engineering the cost of manufacturing is extremely high; the tooling costs alone can be significant. Think bridges, cars, circuit boards and houses. The task of the traditional engineer is to design a physical solution to a problem and then communicate how to build the thing they have designed, so that a separate group of people can take those designs and fabricate them. The fabrication stage is very expensive, so a great amount of rigour is used to ensure that the plans are as accurate, expressive and correct as possible before building begins. The expectation of many software BUD efforts is to reach this level of maturity, but to reach it engineering companies have to spend a lot of time and money on design, and on verification of that design, before moving on to fabrication. This makes a lot of sense for engineering companies: the fabrication stage is extremely expensive, and just tooling up for production can cost serious money, so they need to know that they are buying the right tools and that by following the plans to the letter they will always build the same product. QA is then needed to control the last statistical variance of production seen in the real world. For BUD software projects to gain the same level of confidence as other engineering efforts, they too must spend extra time verifying their designs, with the last stage of verification being to build the end design at least once before committing to tooling up for mass production. But how many software projects take verification to that extreme before committing to a build?
The idea of not committing to the build stage of a software product until the designs have been verified by building it at least once probably sounds mad. It may be common when designing an aircraft, but software is not an aircraft. And I agree with you. Software products have a significant advantage over mechanical products: they are extremely cheap to duplicate. There is no need to tool up for mass production within a software project in the way there is for physical products. Mass production is not cheap, and the great news is that software does not need it. The reason traditional engineers use mathematical simulations, models, scaled-down builds and so on before committing to the first prototype is that prototypes cost a lot of money and take a long time to build. The more expensive it is to build and test prototypes, the more rigorous the engineering disciplines become in verifying their designs before committing to any kind of build. Software Engineers can take their designs and build the application within minutes, at the cost of a few minutes of electricity to drive the computers. That's it. Reread this paragraph; at its heart is the key to understanding both the engineering side of waterfall and agile methodologies. Now reread it again and think about it.
The misunderstanding
A goal of BUD in engineering is to make the construction stage predictable, to control costs and to reduce risks by knowing the outcome of the construction stage before starting. The big benefit here is for mass production. Taking the time to design everything upfront is expensive and time consuming and, because it is design, the final solution is not clear at the start of the process. The expense of BUD is offset against the savings created by reducing the costs of mass production. On a software project, the mass production phase is as simple as a file copy, a compact disc copy, or a download off the internet or the Apple App Store. As there are no mass production costs with software, where are the savings with which to offset a BUD effort? The cost of the first build? The first prototype? There are no build savings with which to offset the cost of BUD. Having no mass production stage means that software is often viewed as done when the first prototype is completed¹.
Computer languages as the specification language
If a software architect could hand a developer a specification so complete that the programmer could produce exactly what was requested, in the same way that a factory builds a part designed by an engineer, then there would be no need for the programmer, because the specification would be so complete, so unambiguous, that a computer could follow the plans itself and build the application. This point is worth repeating: a software architect cannot define the specification of a system to the point that a developer can build it with no ambiguity, without implementing the full solution themselves. The programming languages used by software experts are the specification languages used by the software industry to describe exactly how to build an application. Software experts then use computers to perform the 'build' for us: they read the formal specifications written in languages like C and Java and compile, verify and link them into machine code. Modern factories are working towards fully automating their production lines; the software industry has had that since the fifties.
Realising that software has no build stage, in the traditional sense, is extremely empowering. It explains why the industry's efforts with Model Driven Design stalled, why the Agile movement gained traction in the nineties and, although Agile was first considered only a fad, why it is still with us in 2011. It explains why the biggest cost savings in the software industry have been realised by improving the expressiveness of programming languages (the specification languages). And it explains why companies that respond to software project problems (client dissatisfaction, lack of trust, failed projects etc.) by strengthening their BUD efforts, while being unprepared for the costs and the lengthening of feedback cycles that come with them, will not experience the increase in certainty that they hope for until they have written the solution at least once². That all-important first prototype. This lesson is especially difficult to learn on BUD efforts, as the feedback cycles on such projects are so long. The failures can take a year or more to be recognised, and the first thought is often that the execution of the process was wrong. The blame game begins: the developers did not follow the plan, the developers were not good enough, the tests were sloppy, or the architects and management were incompetent. The mindset then becomes that tightening up the BUD process further will solve those problems. It will not; that is not where the problem lies. Nobody was incompetent; we as an industry have been focusing on emulating other engineering disciplines without acknowledging the ways in which we are different.
To make waterfall work, a company must be prepared to fully plan a solution to the point where a computer can execute it; it must know where the design of the solution really starts and ends, and accept that the nature of design is to not know what the final outcome will be. It would not need designing if we knew. Once the first prototype has been created, the company can verify it and then decide its next steps with increased certainty; and not before. The risk is that with a long design phase there are long periods with no feedback. If a project is cancelled before its first draft is completed, it is unlikely that there will be any benefit to salvage from it. Therefore, to reduce project risk, reduce the feedback cycle; do not lengthen it.
In conclusion
Software, unlike materials in the real world, is cheap to build and to rework. The cost is in the design, writing and verification phases, and that is the process that needs to be managed. Because of the nature of software, the craftsmen who write the programs are part of the design phase. Attempts to separate the design and 'build' phases did not succeed in the eighties, and they will never succeed, because the build phase of software development has already been automated by relatively cheap machines.
The biggest savings in software development are realised by reducing exposure to risk through shorter feedback cycles, effectively managing labour during the design process, managing the information gained from those shorter feedback cycles, reducing labour costs by automating as much as possible of the non-design work that supports the main design work, and improving the languages used to describe the plans to computers.
¹ Given the analogy between releasing the first draft of a book and the first draft of a program, it is worth taking a moment to compare how software engineering is and is not like writing a book. A book is usually written by one or two people, in a single language, for humans to read, enjoy and learn from, and after the publish date it is difficult to change; it is difficult to see comparisons with engineering here, as books written by authors are rarely rigorous instructions on how to build anything. In contrast, the source code of a software system is nearly always written in multiple programming languages, by many programmers, for computers and other programmers to understand and follow as literal, logical steps; instructions on how to attain a desired effect. And after release, software enters a maintenance phase which can last many years and involves many changes by many more people, often after the original programmers have left.
And so Software Engineering is no more like writing a novel than it is like building a house; it is, however, very like a Mechanical Engineer writing instructions on how to build their design. A Software Engineer writes instructions on how to generate a desired effect that they have designed. It is this similarity of writing, and the need for clear communication, that the opening comparison of this post is designed to draw attention to. Software Engineering is as similar to writing a book as Mechanical Engineering is; and Mechanical Engineering is more like building a house than Software Engineering is. So, for me, this calls into question the very premise that software is built; it is that premise, and its consequences, that are the topic of this post.
² As a side note, once a company has paid with blood to attain the first working prototype, the team that created it often disbands. All that knowledge of the problem, design and solution spaces is gone, and any team that follows has to reverse engineer the original intent, or just hack it. This is how many companies with frustrating projects haemorrhage money.
Labels:
agile,
engineering,
project management,
waterfall
Thursday, 12 May 2011
So what is Software Development? Art, Craft, Science or Engineering?
What assumptions, preconceptions or heuristics help you as a Software Developer? Both now, and those that guide your learning and exploration of new techniques?
Like many practitioners of my age, I have flirted with SWD (Software Development) as an Art form, a Craft, a Science and an Engineering practice many times over the years. I was trained at university as a Software Engineer, by Engineers with backgrounds in Electrical Engineering, Mathematics, Physics or Computer Science. I have worked with developers who considered themselves solely artists and/or craftsmen, and I have worked with scientists.
But why care? Does it matter which field Software Development is most like? Each time that I have considered SWD as just one of those forms, I took inspiration from that field and found that elements of my work were strengthened; however, I also found that my work as a whole was weakened. At first I did not notice that the whole was weakened; I did not have enough breadth of experience to see the effect. Or perhaps, as I was learning in the early days, I lacked enough skill for there even to be a decline to observe.
However, once I had observed that my work was weakened as a whole, it led me to be less interested in whether SWD is any one discipline, other than that SWD is SWD. SWD is a young field, dating back only five decades or so; this is young when compared to fields that date back hundreds or thousands of years. So instead I postulate that SWD has the most to learn by infusing elements from each of these sister fields.
The view of SWD as an infusion of trades is a pragmatic one; I am not sure if it is of the essence of SWD or an acceptance of its young age. Before Science was viewed as a Science, its practitioners were as much Artist, Craftsman and Engineer as they were Scientist. Perhaps Artistry and Craftsmanship can be viewed as a base that Engineering and Science are built upon.
As of the time of writing, the guide that I use most is to view Artistry and Craftsmanship as a base for SWD, and Engineering as an extension that helps with the coordination of larger teams of developers.
Put another way: Artistry reminds me that work done without putting one's soul into it is not worth pursuing; Graphic Design helps me understand how to improve the communication of my software designs, both to the end client AND to the developers who follow after me, by improving the communication of the code base itself; Craftsmanship keeps me practical and learning by doing (rather than being overly theoretical, which is a tendency of mine); and Engineering keeps the rigour in the process, which helps to coordinate cross-skilled teams effectively.
Labels:
musing
Saturday, 7 May 2011
Test tip with Maven Surefire
When starting a new Maven module, I strongly encourage everybody to set the unit tests to execute in parallel.
There are a few benefits to this:
- most obviously the unit tests run faster on modern multi-cpu boxes
- it helps to tease out concurrent bugs earlier
- it helps to enforce good unit testing practices; specifically that there should be no side effects between tests and the order of test execution should not matter
- enabling this feature later on in a project is possible, but it is nearly always a pain so start it early and you will get its benefits almost for free
I recently introduced this practice to a new team that I am working with. Upon enabling it on a module whose unit tests were already in a good state, it highlighted a previously undetected concurrency bug in a third-party library that was being pulled in. Who would have guessed it? Early detection of problems is awesome. (A small sketch of the kind of bug parallel runs expose appears after the config below.)
To enable this in Maven, just add the following runes to your pom.xml file:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<parallel>methods</parallel>
<threadCount>2</threadCount>
</configuration>
</plugin>
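As an illustration of the kind of test smell that parallel execution flushes out, here is a hypothetical pair of JUnit tests (written in Scala here, names invented) that share mutable state. Run serially they happen to pass; run with parallel methods they race on the shared counter and fail intermittently:

import org.junit.Assert.assertEquals
import org.junit.Test

// Shared mutable state between tests: exactly the practice that
// parallel execution punishes, and rightly so.
object SharedStateTest { var counter = 0 }

class SharedStateTest {
  @Test def incrementsOnce(): Unit = {
    SharedStateTest.counter = 0
    SharedStateTest.counter += 1
    assertEquals(1, SharedStateTest.counter)
  }

  @Test def incrementsTwice(): Unit = {
    SharedStateTest.counter = 0
    SharedStateTest.counter += 1
    SharedStateTest.counter += 1
    assertEquals(2, SharedStateTest.counter)
  }
}

The fix, of course, is to give each test its own state, at which point the execution order genuinely does not matter.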
Labels:
best practices,
java,
maven,
testing tips
Java Wish List
When I was young (oh man, I sound old just for saying that)... when I was younger (eee, not any better; oh dear). Back when I used to write compilers and hack the Linux kernel, I never thought twice about adding a new language feature to make my job easier. Over the years, the amount of effort that now requires (mostly in tooling for the various IDEs), and the lack of will at the companies I have worked for to develop such tools, mean I have fallen out of the habit of asking which language feature would make a job easier; the question has instead been replaced with 'what framework feature or library will make this job easier'.
So, as I become increasingly itchy with Java, I have found myself once again asking: what language features do I want? I was surprised at how long the list became. I include short one-line descriptions of the main features that came to mind, and will look to expand on some of them in future posts, as short descriptions do not do them justice.
- Improved GC behaviour where performance does not drop off for big heaps.
- Software/Hardware Transactional Memory (STM/HTM) support built into the language, à la Clojure.
- Simplify the try-with-resources syntax. Something like the following should suffice, using the usual curly braces and the stack for scope.
- autoclose InputStream in = new FileInputStream("a.txt");
- Type inference Scala style. This one feature in Scala, in my opinion makes Scala the language to look at these days.
- A JVM enhancement, support constant pools across an entire jar file and not just a single class. It would reduce the file size greatly.
- To give the compiler extra design intent, and thereby extra compile-time checking, allow the developer to declare that an object is not to leave the stack. References that escape the stack can cause thread-safety problems, so receiving compile-time checks where it matters would be a big help.
- local Context ctx = new Context();
- Compiler to error when references to this escape a constructor during instantiation. This is a common cause of concurrency bugs and so should just not be allowed.
- Closures closures closures, functions as objects, everything as objects, objects objects objects. Closures. – steps down off of his soap box –
- Remove statics. Naturally not achievable for Java, but like global variables they break many of today's best-practice design patterns and should be put out of their misery. Another good point of Scala.
- Build immutable and lockable objects into the language syntax
- Support lazy evaluation of expressions passed as arguments to methods.
- Build object field and method reflection into the syntax of the language.
- Clean and elegant closure support. Okay, easier said than done; I'll expand my ten pence worth on how this should look later, but as a teaser I don't like the Groovy or Scala syntax and prefer the style of the anonymous method syntax. It just feels more uniform to me.
- Reflection can retrieve method parameter names, and even javadoc
- Refresh the internationalisation frameworks and build them more cleanly into the language. I prefer how Apple have done it in Objective-C over the Java resource approach.
- language eval support
- simple dynamic SQL syntax; Nick Reeves taught me a wonderful syntax for this a decade ago and I would love to see it become mainstream
- Another theft from Scala: implicit type conversions and operator overloading (okay, so that's C++ style; we all borrow from each other). So I can write a Radian class, pass it around, and get compile-time type checking, use of language operators, conversions etc. A sketch of this follows the list.
- for ( T a : list ) should not throw NPE when list is null
- Groovy had it right when they added the ?. operator for avoiding NPEs when calling a.b().c()
- Default values for method parameters (geesh, is Java the only language without this?)
- Multiple line strings
- Language support for templates
- Language support for SQL and XML
- Add Interner support
- Add cache framework api built up using the delegate design pattern
- one keyword property declaration
- cleaner support for hashCode, equals and toString
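As promised above, here is a sketch of the Radian wish as it looks in Scala today (all names invented for illustration): a small strongly-typed angle class with operator overloading and an implicit conversion from Double.

import scala.language.implicitConversions

// A strongly-typed angle: the compiler now distinguishes radians from bare Doubles.
case class Radian(value: Double) {
  def +(other: Radian): Radian = Radian(value + other.value)
  def sin: Double = math.sin(value)
}

object RadianDemo {
  // Implicit conversion: a bare Double may be used where a Radian is expected.
  implicit def doubleToRadian(d: Double): Radian = Radian(d)

  def rotate(angle: Radian): Radian = angle + Radian(math.Pi / 2)

  def main(args: Array[String]): Unit = {
    println(rotate(math.Pi / 2)) // the Double converts implicitly; prints Radian(3.14159...)
    println(Radian(0.5).sin)     // prints 0.479425...
  }
}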
Labels:
java,
language design
Wednesday, 30 March 2011
Java 7 musings
With great anticipation I downloaded the new Java 7 preview; like a kid with a new toy, I unwrapped it in a hurry and wanted to play. It was not long before a sense of disappointment set in. As a professional Java developer I will be happy to move from Java 6 to Java 7; as a passionate Software Engineer I am deeply disappointed, and I find that my mind is increasingly leaving the Java sphere (even if my pay cheque is not). New languages such as Haskell and Scala are really pushing for my attention.
There are several very nice new features in Java 7, and a couple of little tidy-ups that will have a big impact on my day-to-day development, such as not having to duplicate generic declarations everywhere. The new NIO features and concurrency enhancements are great; however closures are once again out, and the try-with-resources enhancement that promised to be oh so nice looks ugly! (And yes, aesthetics count! They affect the readability and maintainability of code, i.e. the final cost of the product.)
Java is a fantastic language; however, it now carries so much weight on its shoulders that it changes slowly, and the pressure for backwards compatibility (which they have done a fantastic job with) is resulting in syntax that is messy and painful to read. Specifically, I refer to generics, the new try-with-resources syntax and most of the closure proposals. It is little wonder that we are now seeing a wealth of languages pop up on the JVM; they are now where the new language features and experiments are occurring (i.e. Scala, Groovy, Clojure and Ceylon). As Java becomes increasingly adopted as the language of choice for business, so its rate of innovation is dropping.
Labels:
language design
Monday, 31 January 2011
Advice on learning Spring
I was recently asked for advice on how to go about learning Spring, specifically for materials beyond the main Spring documentation. This got me thinking about why some people pick up Spring in a matter of hours while others have to work at it. As with most material, there are two parts to Spring: firstly the nuts and bolts of the framework itself, and secondly the patterns and practices behind it. This split is so common in computing (and beyond) that it formed the basis of which university I went to. I found a long time ago that focusing on the patterns, practices and theory gave me skills that transferred easily between technologies. Spring is no different.
Given this line of thought I broke my reply into two parts. For the nuts and bolts of Spring, read the first chapter of the Spring documentation, 'Core Technologies - The IoC container'. I have not found anything better than this. However do not expect to fully 'get' it on the first pass, or, depending on your experience, even the third or fourth. Despite Spring's reputed simplicity, it is obscured behind both its own size and, as hinted previously, 'patterns, practices and theory'. No prizes for guessing that the main practices behind Spring are Dependency Injection (DI) and Inversion of Control (IoC); both are referred to frequently within the Spring documentation, and dependency injection in particular is a technique that dates back long before Spring and its ilk.
So how does one really learn Dependency Injection and Inversion of Control? The academically inclined may go back to studying Object Oriented (OO) techniques such as abstraction, encapsulation and polymorphism, and concepts like object cohesion and coupling. The more practically minded person will benefit from doing, and for this I strongly recommend Test Driven Development (TDD) and mocking. Every test that you write following TDD principles practises the core design skills needed for efficient and effective use of Spring. When you have mastered the ability to break a design cleanly along its lines of abstraction, the tests will become focused, simple and robust. At that point the objects being tested will slide into Spring, or any other IoC container, very easily.
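To make this concrete, here is a minimal sketch of the kind of design that TDD pushes you towards; the class and interface names are hypothetical. The service depends on an abstraction injected through its constructor, so a test can substitute a stub without any container at all, and that same seam is what Spring later uses to wire in the real implementation.
interface AccountRepository {
    int fetchBalancePence( String accountId );
}

public class AccountService {
    private final AccountRepository repository;

    // the dependency is injected; the service neither knows nor cares where it came from
    public AccountService( AccountRepository repository ) {
        this.repository = repository;
    }

    public boolean isOverdrawn( String accountId ) {
        return repository.fetchBalancePence( accountId ) < 0;
    }
}

class AccountServiceTest {
    public static void main( String[] args ) {
        // the repository is trivially replaced with a stub; no Spring container needed
        AccountService service = new AccountService( new AccountRepository() {
            public int fetchBalancePence( String accountId ) { return -50; }
        } );
        System.out.println( service.isOverdrawn("12345") );   // prints true
    }
}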
These skills are easy to describe, but difficult to master. Don't be scared to try once, throw it away and redo. Most people are scared of the time that this represents, especially if they are working on a project with tight deadlines. However I prefer to view it as an investment, and I make sure that I build time into any estimates that I provide. One can never expect to become quick at this by always sticking to one's first design. A form of deadlock happens when we stick to our first designs: we never experiment to find better solutions, and eventually we enter a vicious cycle of fire fighting spaghetti code. Once that happens a project will become paralysed. So do yourself a favour: come up with three designs for every problem, and then go with the fourth. Lastly, when you have had a few goes at this, find somebody who is more practised than yourself and whom you respect, and ask them for feedback. Rome was not built by one man alone.
Logging best practices
This post is a brain dump of the practices that have proved useful to me over time while maintaining production Java systems. Please share your experiences, especially if you have come across any great time savers not already mentioned here.
The Practices
The scale of the system that you are working on affects how valuable each of the individual practices below will be to you. Some of the practices that I mention were introduced through experience of working with server farms in excess of 150 JVMs running in a very 'chatty' SOA architecture, and so may not be as valuable in, say, a single process Swing application. However I have come to the conclusion that once these practices are wrapped into a reusable set of classes and bedded into the mindset of the developers, they cost next to nothing to use; so I recommend putting in place at least a placeholder for each practice at the start of a project, as retrofitting a large code base is often more expensive than putting in a good placeholder early.
- Easy log file access
- Self contained messages
- Logs telling the Story of what happened
- Zero tolerance to errors
- Logging as part of QA
- Logging levels
- Minimising Logging Overhead
- Machine parsable logs
- Runtime adjustable levels
- Archive the logs
Easy log file access
The more effort and time involved in gaining access to a system's log files, the less frequently those files will be used. I have worked in environments where developers had to request a copy of the production log files in order to begin investigating a problem. Such requests would take between 1 hour and 3 days to be actioned. This barrier to entry blocks all pre-emptive problem solving, as well as timely feedback from experiments used to reproduce and track down a problem.
My ideal is to have real time search access to all production logs; tools such as Splunk make this very straightforward, especially as it is capable of combining logs from multiple servers. If a licensed solution is not viable then roll up your sleeves and write a few scripts to achieve the same results.
Self contained messages
Each individual message needs to be complete in its own right. That is, there should be one call to the LOG class per key event being reported on, and all of the information for that event must be contained in that call, such as the stack trace and the values of the key data that describe the uniqueness of the event that has just occurred. Do not place a stack trace in a separate log call from the message describing its context.
If the information is spread across multiple calls to the LOG class then, in a multithreaded environment, it becomes likely that the messages will get interlaced, making them much more difficult to decipher. Naturally this interlacing does not occur often during the development cycle, as a developer working on their own tends to only send one request into the system at a time.
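As an illustration, a sketch of the difference (the class and field names are hypothetical); note that Log4J's error method accepts the Throwable in the same call as the message.
import org.apache.log4j.Logger;

public class TransferService {
    private static final Logger LOG = Logger.getLogger( TransferService.class );

    public void transfer( String fromAccount, String toAccount, long amountPence ) {
        try {
            // ... perform the transfer ...
        } catch ( RuntimeException e ) {
            // bad: two separate calls that other threads can interlace
            //   LOG.error( "transfer failed [from=" + fromAccount + "] [to=" + toAccount + "]" );
            //   LOG.error( e.getMessage(), e );
            // good: one self contained call carrying the context and the stack trace together
            LOG.error( "transfer failed [from=" + fromAccount + "] [to=" + toAccount
                + "] [amountPence=" + amountPence + "]", e );
            throw e;
        }
    }
}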
Logs telling the Story of what happened
Context is King
A single line in the log file describes a single event; a series of events tells a story, and in order to understand the meaning of the story it needs to flow. The relationships between the events need to be clear and concise.
To achieve this with Log4J I use the MDC class to push context information that is important to the story being described, such as: the name of the user who made the request, a unique request id that was generated when the request started, the thread id at the time of the message being generated, and the security principal that the request was running as at that time. In environments consisting of multiple co-operating servers you may also find the following values useful in the MDC: the name of the SOA service that generated the logged event, and the machine name where the logged event was generated. Which pieces of data you place on the MDC will depend on your context; remember to make sure that each item pays for itself. It is not a free journey, and we don't want to give away any free tickets.
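A minimal sketch of the pattern (the key names are just the ones mentioned above): push the values at the start of the request, and always clear them afterwards, as threads are usually pooled and reused.
import org.apache.log4j.MDC;

public class RequestLogContext {
    public static void withContext( String userName, String requestId, Runnable work ) {
        MDC.put( "user", userName );
        MDC.put( "requestId", requestId );
        try {
            work.run();   // every log statement made in here can report the MDC values
        } finally {
            MDC.remove( "user" );
            MDC.remove( "requestId" );
        }
    }
}
With a PatternLayout containing %X{user} and %X{requestId}, every line written during the request then carries the context without any extra effort at each call site.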
Context across machine boundaries
In multi machine environments, being able to tie the story together becomes harder, and yet all the more important. Consider passing some of the values that you push onto the MDC between machine boundaries as part of the remote calls; I make it a point of principle to always pass the user name and request id across machine boundaries. This means that if an error is reported to the user, they can then be given an incident number to report to the support staff. This incident number is the request id mentioned previously, and will allow a developer to reconstruct from the log files the full story of what the user was doing as the error occurred.
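The mechanics of carrying the values will depend on your transport (headers, message properties, and so on), but the shape is always the same; a sketch, with hypothetical key names:
import java.util.HashMap;
import java.util.Map;
import org.apache.log4j.MDC;

public class LogContextPropagation {
    // caller side: snapshot the values worth propagating, and attach them to the remote call
    public static Map<String, Object> snapshot() {
        Map<String, Object> context = new HashMap<String, Object>();
        context.put( "user", MDC.get("user") );
        context.put( "requestId", MDC.get("requestId") );
        return context;
    }

    // callee side: restore the propagated values before doing any work
    public static void restore( Map<String, Object> context ) {
        MDC.put( "user", context.get("user") );
        MDC.put( "requestId", context.get("requestId") );
    }
}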
Zero tolerance to errors
On projects that have had no compile time warnings, it has always been easy to spot when one has introduced a warning, and not wanting to be the one caught 'peeing in the proverbial pool' I tend to fix new warnings fairly promptly. If on the other hand there are lots of warnings already in place then my brain shuts off and I no longer notice them; I won't carry on the previous comparison, but I do find that once a build has more than a few warnings they quickly start to multiply without even being noticed. The same phenomenon happens with errors in production logs: it is very easy to spot and respond to them when they are rare, but once they have become frequent they become the norm, and in my experience that is a significant sign that the project is struggling with a vicious circle that needs to be broken. If you are lucky enough to be working on a green field project, then it is best to start this practice early.
Logging as part of QA
The log files are part of the interface to a system; think of them as an API in their own right. They may not be used by the end customer, but in circumstances when they are used to support the end customer they do have an audience. To ensure the quality of any interface to a system, it must be both tested and used frequently. Failure to perform these checks results in the rapid build up of dust, decay and entropy. Nobody enjoys house cleaning, so automating these tests can greatly reduce the burden. The usual suspects apply here: unit testing, mocking of the Log interface and TDD.
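One low tech way to make log output assertable in a unit test is to register a capturing appender; a sketch using Log4J (a mocking library would do the same job):
import java.util.ArrayList;
import java.util.List;
import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.Logger;
import org.apache.log4j.spi.LoggingEvent;

public class CapturingAppender extends AppenderSkeleton {
    public final List<LoggingEvent> events = new ArrayList<LoggingEvent>();

    protected void append( LoggingEvent event ) { events.add( event ); }
    public void close() {}
    public boolean requiresLayout() { return false; }
}

// in a test:
//   CapturingAppender appender = new CapturingAppender();
//   Logger.getLogger( TransferService.class ).addAppender( appender );
//   ... exercise the code under test, then assert on appender.events ...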
Logging levels
Tools such as Log4J support categorising logged messages at several coarse grained levels. It makes a significant difference to the quality of the log files if developers agree early on in the project when to use each of the common log levels, and adhere to that agreement. This consistency helps to reduce spam in the log files and helps turn off the noise when going into production. Personally I start off using DEBUG for all of my messages and then increase the level when I can describe the benefit of doing so. The following table captures the type of explanations that I look for with each logging level. This table is intended to help get thoughts flowing, and is by no means a summary of the only rules that a team could use.
Level | Usage | Expected action | Example |
FATAL | Reporting problems that will cost the company money, and will carry on costing money until resolved. Once a problem has been reported, do not immediately repeat the same message. | Fix ASAP, even if that means waking people up in the middle of the night. As such, false alarms and frequent alerts will not be taken kindly. | The database is down and all user requests are being rejected. |
ERROR | A problem has occurred that was not automatically recovered, but it is not critical to the revenue stream of the company. | In low to medium volumes ERROR messages will trigger investigation in a timely manner. High volumes may be escalated to FATAL. NB: a user entering 'foo' into a number-only field should never be reported at this level; it should be recovered automatically at the GUI level. | Updating a user's details failed due to an unexpected database error. |
WARN | A problem has occurred that was both unexpected and automatically recovered. | Actively monitor, with the goal of pre-empting problems that could escalate. Frequently recurring problems will need to be fed back to development for resolution. | Connection to the exchange rate server has gone down but is not causing an immediate problem thanks to caching; or disk space is down to the last 10%. |
INFO | Report key system state changes and business level events. | Used to answer infrequent queries about the behaviour of the system and its users. | Jim deposited 53 pounds sterling. |
DEBUG | Detailed information about each request. | Used to investigate the causes of a problem. Will usually be disabled in production by default, as it outputs a lot of contextual information. | Value of field x is 92.4; or 'starting process Y' and 'finished process Y in 95ms'. |
Minimising Logging Overhead
Logging is not free. It costs the developer time to add it to a system, it costs the system time to execute the log statements, and it costs the poor soul who has to investigate or support the system time to understand the logs. As a consequence, think twice before logging a new event: is it really needed? If it is, how expensive is it to produce and how often will it appear in the logs?
- don't log unless the event adds to the story being told (if you don't know what the story is, go back to 'Context is King' and do not collect two hundred pounds)
- if the event is very common, bulk them together; don't spam
- if the message to be logged is a constant string use LOG.level("msg")
- if building the message requires expensive string concatenation and runtime performance is important, use isXXXEnabled (see the sketch after this list)
- to improve readability and performance I sometimes use an interface to reduce the number of ifs; this is also useful if the logs are to be translatable
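A sketch of the isXXXEnabled guard mentioned above (the class names are hypothetical): the guard ensures the string concatenation, and any costly toString calls, only happen when DEBUG output will actually be written.
import org.apache.log4j.Logger;

public class PricingEngine {
    private static final Logger LOG = Logger.getLogger( PricingEngine.class );

    public void reprice( Object basket ) {
        if ( LOG.isDebugEnabled() ) {
            LOG.debug( "repricing basket [" + basket + "]" );
        }
        // ... repricing logic ...
    }
}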
Machine parsable logs
To make processing the logs easier, write to them in a consistent format, and when embedding data into the logs use consistent names for the data.
For example, one approach that I have worked with is to output log messages like this:
[user=Jim] has logged on.
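A tiny helper keeps the [key=value] convention consistent across the code base; a sketch (the helper is hypothetical, not part of Log4J):
public class LogFormat {
    public static String kv( String key, Object value ) {
        return "[" + key + "=" + value + "]";
    }
}

// usage:  LOG.info( LogFormat.kv("user", userName) + " has logged on." );
A consistent convention like this makes the logs trivially greppable, for example when pulling out every event belonging to one user.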
Runtime adjustable levels
As discussed under 'Minimising Logging Overhead', we want to avoid spamming the log files. However, when diagnosing a problem we need data, and when reproducing problems we sometimes need a lot of data. For this reason it is very useful to have a mechanism that can increase the log levels on a machine, or group of machines, without having to reboot them. The most flexible approach is to be able to do this per user; remember not to give untrusted users access to such a mechanism.
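With Log4J the change itself is a one liner; the work is in exposing it safely, for example via JMX or an admin only page. A sketch:
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class LogLevelAdmin {
    // raises or lowers the level of a logger (and its children) at runtime, no restart needed
    public static void setLevel( String loggerName, String levelName ) {
        Logger.getLogger( loggerName ).setLevel( Level.toLevel(levelName) );
    }
}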
Archive the logs
You do not want to run out of disk space in production. Monitor the amount of data on the disk drives, and archive off logs for future reference.
Useful Tools
Log4J
Custom shell scripts written in grep/sed/awk, perl, ruby, etc
Splunk
Labels:
best practices,
java,
logging
Saturday, 8 January 2011
Performance comparison of Java2D image operations
Abstract
The motivation for this blog item is to capture notes from some spikes that I made while figuring out which approach to manipulating photographs performs best in Java2D. For the purposes of this article I am reducing the brightness of a picture as a test case, a simple operation that merely requires the colour values of each pixel to be divided by two. More complicated algorithms can be built on the back of this article once we have an understanding of which approach to reading and writing pixels performs best.
The Test Setup
The test picture
An 11.7MB PNG file, measuring 3296 pixels wide by 2472 pixels high.
The test machines
- MacBook Air
- Intel dual core 2.1GHz, 2GB RAM, Nvidia GeForce 9400M
- Windows desktop
- AMD dual core 2.6GHz, 2GB RAM, Nvidia Quadro FX1500
Both machines are running Java 1.6.
The Code Spikes
Eight different approaches to accessing pixels in Java2D are described below, with the results of running them on the two machines described above: one Windows based and the other OS X, both running Java 1.6. The JVM options will also have an effect on the performance of each of these algorithms; to keep this article focused I have left all of the JVM options at their defaults.
Approach 1
Read a single pixel in as a series of integers from WritableRaster.getPixel, and write them out using setPixel. An instance of WritableRaster can be retrieved from BufferedImage.getRaster().
final WritableRaster inRaster = inImage.getRaster();
final WritableRaster outRaster = outImage.getRaster();
int[] pixel = new int[3];
for ( int x=0; x<imageWidth; x++ ) {
for ( int y=0; y<imageHeight; y++ ) {
pixel = inRaster.getPixel( x, y, pixel );
pixel[0] = pixel[0]/2;
pixel[1] = pixel[1]/2;
pixel[2] = pixel[2]/2;
outRaster.setPixel( x, y, pixel );
}
}
- MacOSX (airbook)
- 718 ms
- Windows XP
- 985 ms
Approach 2
Read each pixel in as an encoded integer from BufferedImage.getRGB(x,y), and write them out using setRGB.
for ( int x=0; x<imageWidth; x++ ) {
for ( int y=0; y<imageHeight; y++ ) {
int rgb = inImage.getRGB( x, y );
int alpha = ((rgb >> 24) & 0xff);
int red = ((rgb >> 16) & 0xff);
int green = ((rgb >> 8) & 0xff);
int blue = ((rgb ) & 0xff);
int rgb2 = (alpha << 24) | ((red/2) << 16) | ((green/2) << 8) | (blue/2);
outImage.setRGB(x, y, rgb2);
}
}
- MacOSX (airbook)
- 1495 ms
- Windows XP
- 2219 ms
Using getRGB is twice as slow as getPixel, and provides no particular benefit. Let's avoid this approach and see if we can optimise getPixel.
Approach 3
As Approach 1, except that rather than reading a pixel as a set of three integers, it reads them in as three floats.
final WritableRaster inRaster = inImage.getRaster();
final WritableRaster outRaster = outImage.getRaster();
float[] pixel = new float[3];
for ( int x=0; x<imageWidth; x++ ) {
for ( int y=0; y<imageHeight; y++ ) {
pixel = inRaster.getPixel( x, y, pixel );
pixel[0] = pixel[0]/2;
pixel[1] = pixel[1]/2;
pixel[2] = pixel[2]/2;
outRaster.setPixel( x, y, pixel );
}
}
- MacOSX (airbook)
- 901 ms
- Windows XP
- 1203 ms
A little slower than reading the values in as integers. This surprised me; either the noise of the test has hidden the improvement, or there is some overhead here that is not good. Further trials are needed to differentiate these two possibilities. But before we explore this further, let's try the approach provided by Java2D: the ConvolveOp.
Approach 4
Swing provides a class specifically designed for convolution operations, such as reducing the brightness of a picture. Convolution is the term given to a group of image processing algorithms that average a group of pixels together to create a new value for a single pixel. The following code uses a 1x1 matrix, so each output pixel is computed from just the input pixel that it replaces; a coefficient of 0.5 halves each colour component, matching the per pixel spikes above.
float[] DARKEN = {0.5f};
Kernel kernel = new Kernel(1, 1, DARKEN);
ConvolveOp cop = new ConvolveOp(kernel,ConvolveOp.EDGE_NO_OP, null);
cop.filter(inImage, outImage);
- MacOSX (airbook)
- 184 ms
- Windows XP
- 172 ms
The convolution implementation provided by Swing was significantly faster than the per pixel baseline that was tried first. This was only to be expected, given that the Sun engineers will have spent time tuning the code for precisely this type of use.
Approach 5
To see if the per pixel approach can be improved upon, I took the faster of the two approaches tried so far (the one that used integers) and switched to the getPixels method, which is capable of reading multiple pixels at a time. This spike will show whether there is much overhead in accessing pixels one at a time via getPixel versus a bulk fetch and set of multiple pixels. The batch size has been set to match the width of the picture.
final WritableRaster inRaster = inImage.getRaster();
final WritableRaster outRaster = outImage.getRaster();
int[] pixels = new int[3*imageWidth];
for ( int y=0; y<imageHeight; y++ ) {
pixels = inRaster.getPixels( 0, y, imageWidth, 1, pixels );
for ( int x=0; x<imageWidth; x++ ) {
int m = x*3;
pixels[m+0] = pixels[m+0]/2;
pixels[m+1] = pixels[m+1]/2;
pixels[m+2] = pixels[m+2]/2;
}
outRaster.setPixels( 0, y, imageWidth, 1, pixels );
}
- MacOSX (airbook)
- 534 ms
- Windows XP
- 578 ms
Approach 6
Reading an entire row of pixels in at a time was faster than accessing a single pixel at a time. However it is still not approaching the performance of the ConvolveOp. Perhaps fetching two rows at a time will be faster still?
final WritableRaster inRaster = inImage.getRaster();
final WritableRaster outRaster = outImage.getRaster();
int[] pixels = new int[3*imageWidth*2];
for ( int y=0; y<imageHeight; y+=2 ) {
pixels = inRaster.getPixels( 0, y, imageWidth, 2, pixels );
for ( int x=0; x<imageWidth; x++ ) {
int m = x*3;
pixels[m+0] = pixels[m+0]/2;
pixels[m+1] = pixels[m+1]/2;
pixels[m+2] = pixels[m+2]/2;
int n = m+imageWidth*3;
pixels[n+0] = pixels[n+0]/2;
pixels[n+1] = pixels[n+1]/2;
pixels[n+2] = pixels[n+2]/2;
}
outRaster.setPixels( 0, y, imageWidth, 2, pixels );
}
- MacOSX (airbook)
- 429 ms
- Windows XP
- 453 ms
Approach 7
Reading in two rows of pixels at a time was for the most part faster than reading one row at a time. As with all of the timings taken, Java varies greatly between runs, so out of curiosity I wanted to know how much slower processing half a row at a time would be.
final WritableRaster inRaster = inImage.getRaster();
final WritableRaster outRaster = outImage.getRaster();
int halfWidth = imageWidth/2;
int[] pixels = new int[3*halfWidth];
for ( int y=0; y<imageHeight; y++ ) {
pixels = inRaster.getPixels( 0, y, halfWidth, 1, pixels );
for ( int x=0; x<halfWidth; x++ ) {
int m = x*3;
pixels[m+0] = pixels[m+0]/2;
pixels[m+1] = pixels[m+1]/2;
pixels[m+2] = pixels[m+2]/2;
}
outRaster.setPixels( 0, y, halfWidth, 1, pixels );
pixels = inRaster.getPixels( halfWidth, y, halfWidth, 1, pixels );
for ( int x=0; x<halfWidth; x++ ) {
int m = x*3;
pixels[m+0] = pixels[m+0]/2;
pixels[m+1] = pixels[m+1]/2;
pixels[m+2] = pixels[m+2]/2;
}
outRaster.setPixels( halfWidth, y, halfWidth, 1, pixels );
}
- MacOSX (airbook)
- 418 ms
- Windows XP
- 453 ms
This time the spike was faster than processing two rows of pixels at a time. So it would appear that two rows at a time is faster than one row at a time, and half a row is faster still. Huh? What is going on here? It would appear that the noise in the timing of the operations is greater than the performance difference from varying the number of pixels processed at a time. Clearly reading multiple pixels is preferable to reading one at a time, but beyond that it does not compete significantly with the ConvolveOp, which is still king.
Approach 8
Investigating the getPixel methods was interesting, but if we are to see a significant improvement in performance a totally different approach is going to be needed. For the last spike I tried to bypass as many of Java2D's layers as possible and access the picture data directly via the DataBuffer class. It is still a long way from the hardware, as is common in Java, however it will give us an idea of how much overhead the Java2D classes BufferedImage and WritableRaster add.
// For this approach to work it is important that both DataBuffers use the same picture
// encoding under the hood as each other, otherwise the picture will corrupt
final WritableRaster inRaster = inImage.getRaster();
final WritableRaster outRaster = inRaster.createCompatibleWritableRaster(imageWidth, imageHeight);
DataBuffer in = inRaster.getDataBuffer();
DataBuffer out = outRaster.getDataBuffer();
int size = in.getSize();
for ( int i=0; i<size ; i++ ) {
out.setElem( 0, i, in.getElem(0, i)/2 );
}
BufferedImage outImage = new BufferedImage(inImage.getColorModel(), outRaster, true, null);
- MacOSX (airbook)
- 71 ms
- Windows XP
- 63 ms
Conclusions
Comparing the performance of different image processing approaches in Java has been a challenge. Java is unable to give anything close to constant time for processing the same image; during the course of running the above code fragments I saw variations ranging from 700ms to 7000ms. To help smooth the results I placed a call to System.gc() between each spike, and reran the tests many times to bed the system in and to take an average, ignoring any really wild values. This behaviour makes Java very unsuitable for any type of real time image processing application. The variance of Java's performance aside, there are some very clear trends in the results. If you are implementing a convolution algorithm and performance is not your key concern then the ConvolveOp is excellent; however if performance is vital then, with some extra effort spent understanding the encoding of the DataBuffer used by your image, you can get a 2-5x performance boost over ConvolveOp by accessing the DataBuffer directly. If the algorithm that you are working with does not boil down to a convolution then you will do okay reading in a line of pixels at a time using getPixels (avoid getRGB); however the clear winner was the DataBuffer.
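For reference, a minimal sketch of the kind of timing harness described above (the Spike class is hypothetical): warm the JVM up first, request a GC between runs, and average several runs to damp the variance.
public abstract class Spike {
    public abstract void run();

    public long averageMillis( int warmUps, int runs ) {
        for ( int i=0; i<warmUps; i++ ) { run(); }   // let the JIT settle
        long total = 0;
        for ( int i=0; i<runs; i++ ) {
            System.gc();                             // reduce interference between runs
            long start = System.currentTimeMillis();
            run();
            total += System.currentTimeMillis() - start;
        }
        return total / runs;
    }
}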
Afterthoughts
I wrote this article out of curiosity about Java's image processing capabilities; it is widely recognised that there are better languages for the job. I have found it possible to do reasonable image processing effects for Java applications in pure Java, but I would not consider it for anything that needed real time response times. Staying in the Java realm for the moment, it would be interesting to also compare SWT with the approaches tested in this article, as well as using JNI to access hardware such as graphics cards. However, at the point of accessing hardware directly it removes the main reason why I considered Java: platform independence. Perhaps I will try C# or good ol' C next.
Appendix
Labels:
java