Requirements for Waterfall, Agile and Cowboy

Once again, I am comparing these three approaches to software development. It might seem funny that I could treat “cowboy” like it is a real methodology, but there is some reality to it and I would like to explain how & why it works.

First, let’s compare how each one handles requirements:

Waterfall – You gather all of your requirements before you start developing. Hopefully, you gather them all correctly, so the project comes out perfect at the end. It only seems to work if you have skilled & experienced requirements takers and givers. Even then, you need some amazing luck, or low standards or something. Otherwise, at the end, you will discover all of the stuff that you missed, or got wrong. For this reason, people often refer to waterfall as “water-fail”. Of course, that nickname is only funny if you haven’t been burned by it, or your scars have healed.

Agile – You gather some requirements and start working with the ones you have. As the developers are working, the business analysts gather more requirements. You get short bursts of work done, and at the end of each burst you discover which requirements were wrong or incomplete. You add those into the next development cycle. Your cycles need to be rather small (1/2 – 2 months for each dev cycle). After failing and correcting enough times, you are eventually likely to get everything right. YMMV.

Cowboy – You don’t really know how to gather requirements. You just take your best guess and start writing a program. Once you have enough working, you show it to the customer/users. They try to figure out how to use your program and give you feedback about what is not working or is awkward, or insufferable. You can’t easily distinguish between which ones are bugs and which are requirements. It doesn’t really matter because it is all part of your “to do” list. Once your “to do” list is empty, you are done.

There is one common thread between all three of these. They all gather requirements and do testing. Some are more formal, and optimistic about their ability to gather requirements. Others are (perhaps) more realistic, and acknowledge that the end-users are going to see some bugs.

The big differentiators are 1) how good are you going to be at giving/getting requirements, and 2) who does the most testing and catches the most bugs.

The funny thing is that, in reality, Waterfall usually reverts to cowboy at the end. Cowboy usually starts with a mini-waterfall and Agile seems to rock back-and-forth between mini-waterfall and mini-cowboy. In fact, most agile projects are mostly mini-waterfall or extra-fancy cowboy.

My point is this: Plenty of teams are able to be successful (to varying degrees) with each of these. The key is to know your strengths and pick an approach that plays to them. The biggest cause of failing at one of these is believing that your natural approach is the wrong one, and struggling to avoid the way which fits you best.

Posted in Methodology

Performance Testing – 201

In my career, I’ve done a few performance studies. (more than just a few). (I’m trying to be humble, but it’s not working). (Sorry).

I’ve had some good mentors along the way, and I’ve done a lot of studying to learn what resources are common bottlenecks, how to detect them, and what can be done about each one. It sounds pretty easy, but sometimes it is harder than it sounds.

A few years back, my team was implementing a new service. It would get a lot of traffic and needed to perform well.  I started talking to one colleague about how to test the performance of this stuff. He stopped me after a few seconds and said that he already knew all about this stuff, and I should step aside so he could bust-out a quick performance study, and prove that everything was performing great. The way that he said it (in my experience) is usually “a tell”. No worries.  I stepped back and waited to be impressed.

First, he came back with an answer like “Yep, it works great. What’s next?” So I asked him for some statistics to back his claim. He left and came back in a few hours with a graph that looked like this:

Graph-bland-zoomed

I was like, “um, what am I looking at?”
He was like, “Performance graph”
I was like, “No. What does this mean? There are no descriptions.”

So he went away for a few minutes and came back with this:

Graph-bland

I said that I liked the descriptions, but I didn’t find it very convincing. How did that prove that our system was performing well and would scale nicely?

He said “Look. Everything is at the bottom. Nothing is over 40%. We are good-to-go”.

He didn’t quite understand why I wasn’t satisfied yet. So I elaborated: Flat lines don’t tell you anything. If none of those lines reach the top, then your test hasn’t confirmed anything about your capacity. You need to exercise the system while you measure it. This will help you identify your bottlenecks. And don’t tell me that there are no bottlenecks, because every system has them. Some are narrow, some are spacious and some are gigantic. You still need to 1) perform tests which reveal the capacities of a system, and 2) monitor the resources which are most likely to be your bottlenecks and affect performance.

Most relevant metrics:

From my experience, you will usually find what you need when you measure these resources:

  • Processor – Total, (or each processor ONLY if there are 8 or fewer. It is difficult to read a graph of 32+ simultaneous CPUs)
  • Logical Disk – % Disk time – for each disk/controller
  • Logical Disk – Current Disk Queue Length – for each disk
  • Memory – % committed bytes in use
  • Memory – Page faults/sec
  • Network interface – bytes total/sec
  • Network interface – output queue length
  • Objects – Threads
  • Physical Disk – % disk time
  • Process – Handle count
  • Process – Page faults/sec
  • Processor – Interrupts/sec
  • Server work queue – active threads
  • Server work queue – Queue length
  • Server work queue – Total operations/sec
  • System – Processor Queue Length
  • System – Processes

Method:

  1. Use PerfMon from a different machine (NOT THE MACHINE THAT YOU ARE MONITORING). Of course, this will cause extra network usage, but it is negligible compared to what is happening on the machine that you are testing. While PerfMon is collecting and graphing this data, it will be very busy, which is why you don’t want to run it on the machine that you are monitoring.
  2. Set up PerfMon to record its data to a file. That way, you can take your time to analyze each metric, individually, later. Sometimes it is necessary to zoom-in-on segments of a graph, especially during interesting time periods of a test (peaks, gaps). You won’t be able to do this effectively real-time (during a test).
  3. Try several stress/capacity tests with variances
    1. A human doing normal usage (baseline)
    2. Five humans testing rapidly
    3. One bot working very rapidly
    4. Five bots working very rapidly
    5. Twenty five bots working rapidly
  4. During your capacity tests, measure the following
    1. Elapsed time for each action (round-trip, page, response, etc)
    2. Measure how many actions-per-minute can be performed for each action
    3. Detect if any errors happened during your tests. Were they the byproduct of stressing your system? What was the threshold (capacity) at the time that you broke it?
    4. If you break the system, what was the error? Can it be correlated to a line of code, database table, or external system?
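
Here is a minimal sketch of steps 3 and 4, assuming your "bots" can be written as plain Python functions. The names `run_bot`, `capacity_test` and `fake_action` are hypothetical, and `fake_action` is only a stand-in for one real round trip to your service. It runs the same action with a growing number of concurrent bots and reports elapsed time per action, actions-per-minute and the error count.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def run_bot(action, iterations):
    """Run `action` repeatedly; return (list of elapsed seconds, list of errors)."""
    timings, errors = [], []
    for _ in range(iterations):
        start = time.perf_counter()
        try:
            action()
        except Exception as exc:  # record the failure instead of stopping the run
            errors.append(repr(exc))
        else:
            timings.append(time.perf_counter() - start)
    return timings, errors


def capacity_test(action, bots, iterations_per_bot):
    """Run `bots` concurrent workers and summarize what happened."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=bots) as pool:
        futures = [pool.submit(run_bot, action, iterations_per_bot) for _ in range(bots)]
        results = [future.result() for future in futures]
    wall_seconds = time.perf_counter() - wall_start

    timings = [t for bot_timings, _ in results for t in bot_timings]
    errors = [e for _, bot_errors in results for e in bot_errors]
    return {
        "bots": bots,
        "completed": len(timings),
        "errors": len(errors),
        "avg_seconds": mean(timings) if timings else None,
        "max_seconds": max(timings) if timings else None,
        "actions_per_minute": len(timings) / wall_seconds * 60,
    }


def fake_action():
    """Hypothetical stand-in for one round trip to the service under test."""
    time.sleep(0.001)


if __name__ == "__main__":
    # Same variances as the list above: one bot, five bots, twenty-five bots.
    for bots in (1, 5, 25):
        print(capacity_test(fake_action, bots=bots, iterations_per_bot=20))
```

Compare the summaries across runs. If actions-per-minute stops growing as you add bots, or errors start to appear, you have probably found a capacity threshold worth lining up against your PerfMon log.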

Analysis:

  1. Any metric with a name like Queue or Queue Length is best if it is zero. If it ever is above zero, that means the system is waiting for a resource (bottleneck). If a queue length spikes up, that is an indicator that you are above capacity for that resource. (Disk, network, processor)
  2. Ideally, your CPUs should reach 100% for much of a capacity test. If your CPUs never reach 100% (maybe only 95% or 98% ceiling) that is bad, because something is bottlenecking your system and preventing it from full utilization. You need to find the culprit.
  3. If your CPUs seem to do a little dance, where one is up while the other is down, and they seem to be mirrors of each other, then that means your processes are single-threaded and you are not running enough simultaneous work, or your system is single-threaded and you have a serious problem.
  4. Memory, objects, threads, should generally be flat or hit a plateau. If they go steadily up, without leveling-off, then you might have a resource leak and you need to find it.
  5. Compare your different test runs to determine how many records could be processed during peak utilization.
  6. If your tests broke your app, what line(s) of code, or external resources are responsible?
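
The first few checks above can be sketched in code, assuming you have already exported your recorded counters into simple lists of numbers, one list per counter. The function names, the counter labels and the sample values below are all hypothetical, and the leak check is only a rough heuristic (compare the averages of the first, middle and last third of a run), not a real detector.

```python
from statistics import mean


def queue_bottlenecks(samples, threshold=0.0):
    """Return the queue-style counters that ever rose above `threshold`."""
    return sorted(
        name
        for name, values in samples.items()
        if "queue" in name.lower() and max(values, default=0.0) > threshold
    )


def cpu_reached_ceiling(cpu_values, ceiling=99.0):
    """True if the CPU counter ever hit the ceiling during the test."""
    return max(cpu_values, default=0.0) >= ceiling


def looks_like_leak(values, min_growth=0.05):
    """Rough heuristic: the average keeps climbing from each third of the
    run to the next, instead of leveling off at a plateau."""
    n = len(values) // 3
    if n == 0:
        return False
    first, middle, last = mean(values[:n]), mean(values[n:2 * n]), mean(values[2 * n:])
    return middle > first * (1 + min_growth) and last > middle * (1 + min_growth)


if __name__ == "__main__":
    # Made-up samples, only to show the shape of the input.
    samples = {
        "Processor % Total": [35, 38, 40, 39, 37, 40],
        "Disk Queue Length": [0, 0, 3, 7, 2, 0],
        "Network Output Queue Length": [0, 0, 0, 0, 0, 0],
        "Threads": [100, 110, 125, 140, 160, 185],
    }
    print("Queues above zero:", queue_bottlenecks(samples))
    print("CPU reached ceiling:", cpu_reached_ceiling(samples["Processor % Total"]))
    print("Threads look like a leak:", looks_like_leak(samples["Threads"]))
```

With these made-up numbers, the CPU never gets near the ceiling while the disk queue spikes, which is the same story as my colleague's flat graph: the test has not shown capacity, and the disk is the first place to look.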

Bottom line: See #2.  Your best outcome is: having all of your CPUs spiked. If you have that, you are probably good. Otherwise, you have more work.

In my experience, you are most likely to observe a bottleneck because of drive latency or a database that requires better tuning (indexing).  Database tuning is pretty easy.  With the falling cost of SSDs, even drive latency is easy to overcome.

Performance studies can be pretty easy, once you know what you are doing.  Start with the right objective, measure the right resources, find your capacities, and then you may announce that you are good to go, and I might just believe you (maybe).

 

Posted in Optimization, Review, Testing

Comparing yourself to others

This is a frequent topic with my kids and even some peers. “I’m not good enough” or “I wish I could be like that person. They have it all”. “I don’t think I could ever achieve what that person has.”

When I was in college I did an internship during my senior year. The workplace was pretty high-tech and I really admired my boss. I remember agreeing with the other interns, “someday I want to be just like that guy”. I also remember how disappointed I was, one year later, when I was asking my boss for some advice about a program and he confessed that he didn’t know. In one small year, I had already surpassed my boss (on a technical level). I had already achieved my “someday” goal. So now what? Which way was upward and onward? It took me a few days to get over it and pick a new goal.

Your best goal is always going to be improving yourself, perpetually. It will help you grow, and make you into a better you. You never know what you can achieve until you really stretch yourself. It sounds like a stupid goal, “just be better than you were yesterday”. After all, every day you are unlikely to be less than you were yesterday. Duh. Right?

The key is not to measure yourself in millimeters, but in inches or feet. Every day or every week, think about greater things that you could be doing or learning, or even trying.  Make sure that you have chosen something (and are not just coasting).  Make a point of working on yourself and check that you are showing measurable progress. Have criteria for your growth. Pick a good stride and stick with it. That should be your goal: to keep up with your targeted pace.

One last thing. Be careful about picking unachievable goals. Although “a journey of a million miles begins with one footstep”, nobody really just goes on a walk for a million miles, and survives. Likewise, it isn’t very helpful to compare yourself to someone like Warren Buffett, or Bill Gates or Arnold Schwarzenegger. You must know that you will not be able to top folks like those. Face it, any person could only become “so much” like Michael Jordan or Justin Timberlake. You still need to be realistic about physical barriers (money, anatomy, intellect).

The point of the statement (about journeying a million miles), is to pick a direction, get up and get going. Instead of saying “nobody can travel a million miles”, you should be thinking “a hundred miles is challenging, but possible for me. How would I do it?” and then build a plan and get going.

This is the best way to compare yourself to others. Don’t aim too low, or too high. Pick a good pace and persevere. Be prepared to pick a better goal (if you need to). Ready, go!

Posted in Lessons Learned, Personal

Why your web server needs a data mart

The other day, I had a fun discussion with a colleague. We talked about data marts. Just in case you are not familiar with the term, let me take a moment to describe the concept.

A data mart is basically a small database that contains a sub-set of your company’s data. It is a copy.  On the surface, it might sound like a waste of time. After all, maintaining two sets of data can be challenging, and it is an opportunity for mistakes.  So why bother?

Background

Most IT systems (programs) store their data in some kind of database. Some systems are (primarily) meant for gathering data and some are meant for displaying data. Big companies tend to have very large databases. Getting useful data out of them can be slow. Inserting data into a big DB can slow-down other people who are trying to read data. On a web site, you don’t want “slow”. There are of course, ways around this.

Managing/Dealing-with Large Data

When you think about data, it is best to think of your data like your money. Keeping it is good. The more (useful data) you gather, the more power it can provide to you. You always need some of it at-hand. Once you gather a bunch, you probably want to start putting some away (like a 401k). Your goal will be to accumulate it steadily. The expectation is that it will be more valuable someday. Having a good plan is important, so you can get a good value out of it someday.

Here is something that you probably don’t want to do: carry all of your money around with you all of the time. Why? Because you don’t want someone to steal it, it is probably big and bulky, and you don’t really need it right now. Just put it in the bank and you can get to it when you need it.

Data is like this too. On your web server, it is pretty rare for someone to need all of the data that your company has been accumulating. You usually only need current data, or a few summaries, and maybe sometimes, you need a big chunk, but not all of the time.

Solution: Data Mart

A data mart is basically like a wallet full of data, or maybe a debit card for data, or something like that. It is a smaller database that only contains the data that your web server is typically going to use on a given day, for one system. You don’t have your debit card connected to your 401k, right? You also don’t need your web server connected to all of your data. Your data is also probably pretty bulky and slow. Maybe bad people would like to do evil stuff with it, so you should protect it and only carry around the data that you actually need.
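
To make the idea concrete, here is a minimal sketch using Python’s built-in `sqlite3` module with two in-memory databases. Everything in it is an assumption for illustration: the `orders` and `recent_orders` tables, the column names, the sample rows and the cutoff date. A real data mart would be refreshed on a schedule by whatever ETL tooling you already use.

```python
import sqlite3


def build_data_mart(main_db, mart_db, cutoff_date):
    """Copy only recent orders, and only non-sensitive columns, into the mart."""
    mart_db.execute("DROP TABLE IF EXISTS recent_orders")
    mart_db.execute(
        "CREATE TABLE recent_orders ("
        "id INTEGER PRIMARY KEY, customer TEXT, total REAL, order_date TEXT)"
    )
    # card_number is deliberately left behind in the main DB.
    rows = main_db.execute(
        "SELECT id, customer, total, order_date FROM orders WHERE order_date >= ?",
        (cutoff_date,),
    ).fetchall()
    mart_db.executemany("INSERT INTO recent_orders VALUES (?, ?, ?, ?)", rows)
    mart_db.commit()
    return len(rows)


def demo():
    """Build a tiny main DB (made-up rows) and a mart holding only 2016 orders."""
    main = sqlite3.connect(":memory:")
    main.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT, card_number TEXT, "
        "total REAL, order_date TEXT)"
    )
    main.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
        [
            (1, "alice", "4111-xxxx", 20.0, "2015-01-10"),
            (2, "bob", "5500-xxxx", 35.5, "2016-03-02"),
            (3, "carol", "3400-xxxx", 12.0, "2016-03-15"),
        ],
    )
    mart = sqlite3.connect(":memory:")
    copied = build_data_mart(main, mart, "2016-01-01")
    return main, mart, copied


if __name__ == "__main__":
    _, mart, copied = demo()
    print("rows copied into the mart:", copied)
    print(mart.execute("SELECT * FROM recent_orders ORDER BY id").fetchall())
```

The web server would only ever connect to the mart. It sees less data (only recent rows) and fewer columns (no card numbers), which is the wallet instead of the 401k.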

To summarize, a data mart for your web server, benefits you by:

  • Isolating traffic – web server demand is isolated from your main DB, so web traffic doesn’t affect your main DB, and traffic to your main DB doesn’t slow down your web server. This also helps contain a DDoS attack: a flood of web requests hits the data mart, not your main DB.
  • Smaller data = faster – It is usually much faster to query a smaller amount of data. This helps you keep up with normal, healthy traffic and yields quick responses.
  • Less data = less exposure – If your web server ever becomes compromised (hacked), the culprits are likely to get everything that is on the server, and might get everything connected to it. If you plan your systems for this possibility, you will see that this defensive posture (of having less data) minimizes the damage which could occur from a data breach.

Bottom line: keep different (isolated) systems for your internal users, and external users. It takes a little more thinking, planning, and equipment, but it is much better than walking around with a big sack of money.

Posted in Architecture, Database, Optimization

Value of a senior developer – part 3

In part 1, I talked about some unusual interview questions that were posed to me: 1) What does a company get from a senior developer for all of that extra cost? 2) You have a lot of experience with leadership, but we don’t need that. Can you just be a really good developer and not try to lead?

I suppose in parts 1 & 2, I mostly went into why those are not good questions, but didn’t exactly answer them. So, here is a quick list of things that I would expect a senior developer to know, which go above the skill-set of a mid-level developer.

  • Start a project (correctly)
  • Finish a project (correctly)
  • Provide reliable estimates
  • Operate in a predictable/reliable manner
  • Assess other developers
  • Assess the skills of a business analyst, tester or project manager (objectively)
  • Be self-aware and know his own strengths/weaknesses and play to each of them
  • Advise management about options (technologies, dev practices, security, logic, paperwork)
  • Inherently prepare contingency plans
  • Anticipate risks and other problems in a project
  • Know when to go cowboy (in emergencies) and when to be patient
  • Know other technical people’s jobs: DBA, server admin, security admin, software architect
  • Show good judgment about which work should be done by the senior developer, and which work should be delegated
  • The buck stops with him, with-regard-to software development & problem solving
  • Know when something is over his head and when to get help

I could add a lot of other skills to this list, but I think you get the picture. A senior developer is not just someone with a lot of experience, or gray hair, but rather, he is someone who thoroughly knows software development projects and can entirely handle things himself. He is the leader and he is reliable. He knows what to do. He also can fill missing/empty roles if/when necessary.

If any of the skills on that list don’t seem like a big-deal to you, then you should speak with a senior developer about what goes wrong, when a person cannot perform those tasks. I could write two anecdotal stories about each of them, to prove how important they are, and why they are harder than they sound. The more skilled/experienced your senior developer, the higher-quality you will get out of each of those. Your projects should purr like a Cadillac.

Btw, if you have projects that don’t go-well, and you are not sure why, then you might want to use this list as an evaluation “check list”. Use it to evaluate your projects and see how many of them are contributing to your problems.

Posted in Career, Professionalism

Emerging from technical bankruptcy

The process for emerging from technical bankruptcy is actually similar to financial bankruptcy.  If you decide to pay-down the technical debt, instead of starting over, it means you need to change your current practices.  You have been living beyond your means and you have been focusing your resources in the wrong place.

To actually pay-down your tech debt, you absolutely must cut-back on other projects and assign a higher priority to the process of paying-down-debt. If I’m making it sound easy, let me correct that notion.  This is going to require a significant effort.

On the other hand, if you think you will do better by starting over, think again.  Before you try it, you really need to take a deep look at yourself and your processes, so you don’t arrive at the same point, over-and-over again.

It is a long road back and it begins with things like “living within your means”: being realistic about your ability to produce and maintain software, committing to improving your processes, and paying attention to quality.

It is a lot of work and a lot of commitment, but you can do it. Keep your head above water.  Live within your technical means.  Don’t accumulate technical debt.  Stick with these rules and some day, you won’t have to spend all of your resources, paying interest. Remember: Accumulating debt and never paying it down, will lead to one place: bankruptcy.  Don’t go there.

 

Posted in Methodology

Management will prefer more tech debt (in so-many words)

Disclaimer: This is not about my current project or company. It is about all projects and companies, everywhere.

If you are familiar with “technical debt”, I’m sure you would acknowledge: tech debt is bad, and if you allow it, you might never be able to get rid of it. So it makes you wonder why a team would accumulate any of it in the first place?

Believe it or not, technical debt is a tempting proposition for management.  To understand why, let’s first look at the ways to prevent tech debt.  Preventing tech debt is really just a matter of finding & resolving it very early in the process.  The sooner you resolve it, the lower your chance of accumulating it.  This is accomplished by reviewing code often, and enforcing standards:

  • Developer self-reviews and refactoring
  • Peer reviews
  • Formal code reviews
  • Architecture and system reviews
  • Quarterly and annual upgrades to technology
  • An overall attitude of constant improvement

If you are a manager, you probably read that list and thought, “Wow, that is a list of increasingly lofty goals.” If you were a VP, you probably read the same list and thought, “We can live without all that stuff. Cut it from scope”.

The reason is that each one takes time. The purpose of a review is to find things to improve, and by “improve”, I really mean “change”.  Changes mean re-testing, which usually means more bugs and more re-testing.  It adds risk, and management has trouble seeing the reward, while they can easily see the risk. After all, if the system is already running (as-is), then there must not be a problem, right?  This is why management feels a strong incentive to ignore the reward for resolving tech-debt. Therefore, you can see how accumulating technical debt would seem like a tempting proposition. You could spend that money (time) elsewhere.

Reward for the risk

First, to be clear: all of the reviewing in the world, won’t make a bit of difference if you cannot/will-not commit-to resolving the problems that are found during a review.  It is like seeing a house fire, saying “someone should put-out that fire” and then walking away, without doing anything to help.

Every bit of tech debt is a metaphorical fire hazard.  If you lived in a house where some of the electrical work was not up-to-code, you might be willing to live with it.  However, there are building codes for a reason.  Each of those violations is a drop-in-a-bucket, or a straw on a camel’s back.  It may be difficult to discern where you turn the corner from “a few building-code violations” to “this is a death-trap”.  You might not realize it until something bad happens.  When it does, then everyone is suddenly an expert and “knew this would happen”.

Okay, so it is tempting, but dangerous.  Now what?

In reality, your IT management needs to define these policies. Hopefully, they realize this and assign a reasonable tech-debt ceiling (like a credit limit) and “monthly payments” (code-reviews, and monthly tech-debt resolution budget).  If not, then send them my way and have them read a few articles.  Then encourage them to goog/bing it for a 2nd opinion.

Credit companies understand debt and debt management.  They require monthly payments, and have credit limits.  Since monetary debt has many similarities to tech-debt, you may want to consider similar practices.
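
As a small worked example of the credit-card comparison, here is a sketch that tracks a debt balance against a ceiling. The function name `simulate_debt`, the unit (estimated hours of clean-up work) and every number are hypothetical; the only point is to show what happens when the monthly payment is smaller than the new debt taken on each month.

```python
def simulate_debt(months, new_debt_per_month, payment_per_month, ceiling, starting_debt=0):
    """Return a list of (month, debt, over_ceiling) tuples.

    Debt is measured in estimated hours of clean-up work (an assumption).
    """
    debt = starting_debt
    history = []
    for month in range(1, months + 1):
        # Add this month's shortcuts, subtract this month's clean-up budget.
        debt = max(0, debt + new_debt_per_month - payment_per_month)
        history.append((month, debt, debt > ceiling))
    return history


if __name__ == "__main__":
    # Taking on 10 hours of debt a month while paying down only 4.
    for month, debt, over in simulate_debt(6, new_debt_per_month=10,
                                           payment_per_month=4, ceiling=15):
        flag = "OVER CEILING" if over else "ok"
        print(f"month {month}: {debt} hours of debt ({flag})")
```

Once the balance crosses the ceiling, the policy question is the same one a credit company asks: stop new spending, raise the payment, or both.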

Posted in IT Psychology, Lessons Learned, planning