Navigate/Search

The Joy of Anal Text

Here’s a hot tip, boys and girls: never try to normalize your data using text fields as your primary identifier. The computer won’t know that “Some Company, Inc” is supposed to be the same as “Some Company, Inc.” or “Some Company Inc”; likewise, “Department” is not the same as “Dept.” is not the same as “Dept” even if they represent the same entity – and the programmer will hate you when they also have to investigate manually find out that “SomeOtherCompany 2000, Inc” is different from “SomeOtherCompany2000, Inc”.

No, really. They will. Upper-case-H-A-T-E, hate you.

17 Responses to “The Joy of Anal Text”

  1. Brian Dunbar Says:

    Ugh – ratty data. I hate ratty data. Not as much as I hate Illinois Nazis but it’s up there.

    See – for about the last five years I was a minor cog rolling ERP out to the company. Every five months we did a new site and every five months during go-live we’d have ‘issues’ where the data from whatever rattle-trap ERP the site was using would infect JDE and make it act badly.

    Then we’d have a lively finger-pointing exercise when my label printing app would just .. not .. function right. Every time I’d have to spend X hours proving that my system was ‘ok’ it was the data from JDE that was making it sick.

    Every … stinkin’ .. time.

  2. Stephen Says:

    But will they hate you?
    Or is that different from “But will they Hate you?”
    (Or “H8” you, for that matter?)

    (Yeah, that’s a tricky problem. I’ve done a bit of it. Ick.)

  3. Brian Dunbar Says:

    Ugh – ratty data. I hate ratty data. Not as much as I hate Illinois Nazis but it's up there.

    See – for about the last five years I was a minor cog rolling ERP out to the company. Every five months we did a new site and every five months during go-live we'd have 'issues' where the data from whatever rattle-trap ERP the site was using would infect JDE and make it act badly.

    Then we'd have a lively finger-pointing exercise when my label printing app would just .. not .. function right. Every time I'd have to spend X hours proving that my system was 'ok' it was the data from JDE that was making it sick.

    Every … stinkin' .. time.

  4. Stephen Says:

    But will they hate you?
    Or is that different from "But will they Hate you?"
    (Or "H8" you, for that matter?)

    (Yeah, that's a tricky problem. I've done a bit of it. Ick.)

  5. protected static Says:

    Then we’d have a lively finger-pointing exercise when my label printing app would just .. not .. function right. Every time I’d have to spend X hours proving that my system was ‘ok’ it was the data from JDE that was making it sick.

    Every … stinkin’ .. time.

    For years, I’ve threatened to make a laminated sign to keep in my office that says “It’s not the [program/report/query], it’s the data.” It would save so much time – just reach for the sign and flash it at the doorway whenever you hear the right footsteps coming…

    My last job in St. Louis was supporting the land acquisition process for a major public works project. Given that construction (the sexy part!) couldn’t proceed until all the requisite land was acquired, there were a lot of status reports generated on a daily basis. A couple times per month the senior PM would come and ask for a custom report to drill down into some aspect of the process or another; invariably, he would complain that the report was wrong; invariably, it would turn out to be f$cked-up data entered into a system we had no control over.

    Eh. It’s been years since I had to support real-world data collected by someone else’s system. I’ll get over it. I’ve got a full tank of gas, it’s dark, and my sunglasses are on.

    But will they hate you?

    It’ll keep me warm at night ;-)

  6. protected static Says:

    <blockquote>Then we’d have a lively finger-pointing exercise when my label printing app would just .. not .. function right. Every time I’d have to spend X hours proving that my system was ‘ok’ it was the data from JDE that was making it sick.

    Every … stinkin’ .. time.</blockquote>
    For years, I've threatened to make a laminated sign to keep in my office that says "It's not the [program/report/query], it's the data." It would save so much time – just reach for the sign and flash it at the doorway whenever you hear the right footsteps coming…

    My last job in St. Louis was supporting the land acquisition process for a major public works project. Given that construction (the sexy part!) couldn't proceed until all the requisite land was acquired, there were a <em>lot</em> of status reports generated on a daily basis. A couple times per month the senior PM would come and ask for a custom report to drill down into some aspect of the process or another; invariably, he would complain that the report was wrong; invariably, it would turn out to be f$cked-up data entered into a system we had no control over.

    Eh. It's been years since I had to support real-world data collected by someone else's system. I'll get over it. I've got a full tank of gas, it's dark, and my sunglasses are on.

    <blockquote>But will they hate you?</blockquote>
    It'll keep me warm at night ;-)

  7. dean Says:

    Yes, and you wind up with lovely constructions like

    JOIN table_b B
    ON ltrim(rtrim(a.some_col)) LIKE ‘%’ + SUBSTR(b.somecol, 2, 6) + ‘%’
    AND…

    well, you know.

    That’s why I’m a big advocate of surrogate keys. It is nearly impossible for a programmer to abuse the INT IDENTITY form of surrogate key. Ok, I shouldn’t say impossible. Difficult, perhaps.

  8. dean Says:

    I was going to say ‘doh, I just violated the naming convention’ (somecol, some_col) but then I realized that this is probably pretty standard in most companies.

  9. dean Says:

    Yes, and you wind up with lovely constructions like

    JOIN table_b B
    ON ltrim(rtrim(a.some_col)) LIKE '%' SUBSTR(b.somecol, 2, 6) '%'
    AND…

    well, you know.

    That's why I'm a big advocate of surrogate keys. It is nearly impossible for a programmer to abuse the INT IDENTITY form of surrogate key. Ok, I shouldn't say impossible. Difficult, perhaps.

  10. dean Says:

    I was going to say 'doh, I just violated the naming convention' (somecol, some_col) but then I realized that this is probably pretty standard in most companies.

  11. protected static Says:

    Naming convention? Isn’t that where you get a whole bunch of people together in the same place and decide upon as many different names for the same data points as possible?

    Fortunately, the naming convention for this crapfest database isn’t bad.

  12. protected static Says:

    Naming convention? Isn't that where you get a whole bunch of people together in the same place and decide upon as many different names for the same data points as possible?

    Fortunately, the naming convention for this <del>crapfest</del> <ins>database</ins> isn't bad.

  13. Doug Says:

    Hey, a database post even I could understand! Kudos.

    And I loved the title, too.

  14. protected static Says:

    And I loved the title, too.

    Somehow, I figured you would… :-)

  15. Doug Says:

    Hey, a database post even I could understand! Kudos.

    And I loved the title, too.

  16. protected static Says:

    <blockquote>And I loved the title, too.</blockquote>
    Somehow, I figured you would… :-)

  17. Balls and Walnuts - more than you ever wanted to know » Thirteen moves Says:

    […] static: something about sex. I […]

Leave a Reply