... Some malfunctions in SEVIS have distorted students' files, mysteriously replacing or deleting various pieces of information. Students can be deported if their records aren't in order. Already, some have been held overnight in jails or kept from re-entering the United States because their records were misplaced or because the system garbled them. ... "Oh, you're not going to believe what it's doing now," said Catheryn Cotten, the director of the International Office at Duke University, to the SEVIS help desk technician on the phone as she stared in amazement at her computer screen. Cotten had logged into the system to work on some foreign exchange students' records. But when she clicked a link to access them, SEVIS told her there were 1,800 exchange students still awaiting entry. Cotten was stunned. "I don't have that many [full-time foreign students] at Duke, let alone in my exchange program," she says, recounting the event. Somehow, Cotten had accessed the records of a summer camp program at another school that trains foreigners to be counselors. The help desk operator wasn't surprised. This had happened before; so many times, in fact, that the glitch had been given a name: "bleeding." Several schools have seen their student records mixed up with those of other institutions in the past few months. A student record from Michigan State University, for example, was routed to a printer at a college in Arizona. In other instances, pieces of data are jumbled. Officials at the University of Texas in Austin reported that a student’s last name was listed as an address, one that didn’t belong to any student in the school’s records. http://www.govexec.com/dailyfed/0703/070303h1.htm26 responses total.
hm.. i think a couple of guys from my old job probably wrote that piece of shit /if the story is true, of course/
Heh. My current job consists mainly of converting data from a wide variety of database sources into a format compatible with my company's database. I get to look at raw data in lots of database schema. I'm constantly amazed by just how screwed up these databases can get, even when the design is fundamentally sound (rare), simply by the garbage that users put in them. Thank god I don't have any important parts of my life controlled by databases. </sarcasm>
I've had similar jobs, where some client comes with a database that they have been maintaining manually with something like Microsoft Access and they want to put it into an on-line database. I've never once had a database that wasn't corrupt to the point of being virtually useless. Mostly it ends up being a job that has to be done manually, line-by-line.
Timely item. I was just talking with a non-profit organization's staff about long range planning regarding their database. Right now, they are running a FileMaker database with about 30,000 records, being served from a machine running Mac OS X Server and optimized as a database server. My primary suggestion was that they run two development processes concurrently. One process is the ongoing implementation of small scale changes which improve the functionality of the current system on an as- needed and as-available-to-do-it basis. The other process is the setting aside of funds on an annual basis for development and implementation of a whole new system designed to meet their worst-case (heaviest) usage needs for at least five years beyond implementation. The second process requires some research in order to determine several things: precise needs staff have for database functionality; the differences between those needs and the needs the current system can meet; what products best suit their circumstances and identified needs -- with room to grow -- and; what costs are likely to be required to employ those products. Can some of you with expertise in database development and implementation suggest resources for the latter elements of this research? Key considerations: The office desktops currently run Mac OS 9, and probably will not be changing that for at least a year or two. When they do change, the likely next step will be Mac OS 10.x (x = whatever is current at the time). The database server will not likely change to a different OS, just upgrade to newer versions of the current one.
I had to move an Access database to a MySQL server, a while back. It was the company's parts database, used in generating bills of material for construction jobs. It was a mess...they'd been running it as an Access shared file data base for years until it finally got to the point where it would become corrupted within hours of being restored from backup and couldn't be repaired in a way that would last. Fortunately it was a fairly simple database structure, and I was able to salvage most of the data from the backup copy without too much manual work. The hard part was sorting out a bunch of duplicate part numbers that had somehow sneaked in, even though Access was set to disallow them. Oh, and reworking all the forms and reports, since Access allows identifiers with spaces and MySQL doesn't.
The insidious data corruption problem is the biggest reason I'm glad I'm not using Microsoft Access.
The specific problems I've mostly had resulted from the fact that they wanted to change the structure of the database. For example, you'd have a list of contacts - names of people at various companies or government agencies with addresses, phone numbers and fax numbers. This would be set up as a single table, one row for each person, with most of the fields being just plain text fields. Now they want to merge it into a more "grown up" system that is used for lots of other things too. You might have a separate tables for companies, and people would be linked to their companies. Simple enough in theory. In practice when you look at the database, you find that hardly any company name was spelled the same twice. So "Greenfield", "Greenfield Industries", "Greenfield Ind.", "Greenfield Industries, Inc." and "GI" are all in the database, and by the way, did you know that "RTW" and "Kennametal Industrial Products Group" are the same company as "Greenfield Industries"? There are multiple duplicates for what appear to be the same person with different contact info. Addresses give the state name as "MI" "Mich" "Mich." "Michigan" or "Micigan". Phone numbers are formated in dozens of different ways, some missing area codes. This isn't database corruption in the technical sense - the database software is doing a perfectly good job of keeping the data that was put into it. But it makes for a substantially less useful database. You can't readily call up all contacts for one company, or all contacts in one state. There is no way to clean up a database like that short of having some person go through manaully and fix up each record. Some of it can be done by any random dweeb with common sense, but much would be better done by someone with a lot of knowledge of the folks in the database - which usually means someone who's time is scarce and expensive. Really, the way to have a good database is to start enforcing as much consistancy on it as you can from the get go. But that never happens in the real world. When organization are getting going they have other things on their minds.
This response has been erased.
This response has been erased.
This response has been erased.
It's been a long time since I read the definitions, but ISTR that most real world databases stick to level 3 or 4 normalization. Higher than that is just more PITA than it's worth. SQL is a query language. There is an ANSI standard. Like all languages, it has as many dialects as implementations. MS Access uses a dialect called MS Jet SQL. It has some interesting features, and some pretty lame features, but it generally seems pretty feature-complete to me. I assume Tod is talking about either MySQL or MS SQL Server, both of which use substantially different dialects of SQL. Jan is dead on target about the practical problems that plague databases. I'd also add the fact taht as databases evolve and as employees come and go, the way they get used changes substantially. Maiden names are an example; for some reason people always want to know what a woman's maiden name is but never put a field in their schema for it. So one user might enter it as part of their last name, like "Jones(Smith)". Someone else might just stuff it into the middle name field. A third person might come along and get fed up enough to add a maiden name field to some table, and start using it, but never get around to telling the first two people.
The big problem I see is that databases aren't properly engineered
from the start, and in order to fix that problem, they aren't re-engineered,
but rather patched.
Yeah, that's certainly a problem, but it's not quite the same problem as the fact that users are more interested in having their data be convenient for them than in whether or not it's in the correct field in the right table. There's no engineering process in the world that can deal with the fact that if Joe User wants to see the Part Number on the Inventory screen instead of clicking through to the Parts Definition screen, he'll cheerfully get used to typing it into the Part Name field instead of where it belongs, rather than submitting a change request and waiting some indefinite length of time for it to be "fixed". Maybe, if he's consciencious, he'll type it into both fields, but he'll probably spell it wrong 10% of the time.
Pre-entry validation works pretty well for that, assuming the numbers
are sufficiently different in scope.
An MS Access database can be fairly easily migrated to MySQL by converting the Access DB into a comma delimited file from within MS Access. Once you have a comma delimited text file,(pun) you can use a text editor to remove pesky space problems without risking damage to the original DB. Once these steps are completed, you can import the comma delimited file into MySQL using an import command: "shell> mysqlimport [options] database textfile1 [textfile2 ...]" This command can be used for each table in your new MySQL DB. If running Mac OS X, MySQL,Apache2 and PHP can be installed and configured with minor head aches. Once these components are in place, drop a *.php file containing "<php? phpinfo() ?> into the root directory of the Mac OS X webserver and call the file up in a web browser. If you get a long purple page with the operating system finger print,MySQL info,Apache2 installation details etc, your on your way. PHP allows for rapid development of web page interfaces to MySQL and other DB's. You can serve up your DB to users on the lan, localhost, or the internet via Apache2. If your in a big hurry, you can use several "form generators",written in PHP to create test forms that look nasty and allow you to test and develop further without having to recode a bunch. I used these methods to create a MySQL DB containing 485,000 / 20+ table entries with funky old FAA MS Access garbage DB's. The end result was up and serving the web in one day. I apologize to all for the rip and slash posting...I gotta run...If you want help, I'm bored and you can email me on Grex. I dig Grex.
I suspect the biggest problem is the one mentioned a couple times above, the lack of data authentication on the original ad-hoc application. A good system will have a separate table for "company", and when you add a new contact you then select a company from a drop-down or something. Otherwise there's little or no control over what sorts of gibberish people might enter under the "company" field.
If any of you are Access experts, can you tell me how to change the behavior of the wheel on the mouse, either within a specific form or within Access itself?
Wrong item, Saul.
Re #6: Access is sneaky that way. It works all right as a shared file database, using page locking, until you get more than a few people using it concurrently. Then you get people getting locked out, and data corruption. The sneaky part is this means it fails at right about the time it starts to become really vital to the company. The idea is that you'll then have to go out and buy Microsoft SQL Server as a back end. Re #7: Oh, I get it. Our Commence RM sales database has the same data consistancy problems. Some day it will have to be dealt with, but I hate to think about it. It's also an absolute horror from a normalization standpoint -- for example, whoever designed it put in not only a "first name" and "last name" field, but also a "last,first" field. Then they made the "last,first" field the "name" field for the category (somewhat analogous to a primary key in an SQL database.) This isn't an issue for manual entry, since the form fills out that field after you enter the last name and first name, but it's a real pain when they ask me to import stuff from CSV files. I have to write a script to create a file with the extra field in it. Re #10: We're using Access as the front end, still, with linked tables to the MySQL server. That has its own painful aspects but it avoids retraining our users and lets us keep the same functionality. Eventually we may migrate to a web-based front end, but it's unlikely because it'd be slower and more awkward for the users.
This response has been erased.
Fortunately, it sounds like none of those databases had any binary
objects in them, that's where it gets really hairy, combined with things like
PHP's inclusion of a pg_escape_bytea function, but not a pg_unescape_bytea
function.
re 17: Dealing with the mouse wheel in Access seems to be hard. Here's a couple of links; googling gives plenty more. http://support.microsoft.com/search/preview.aspx?scid=kb;en-us;Q278379 http://www.worship.co.za/BlackandWhiteInc/MouseWheel.htm The second one seems more promising.
Cool, thanks!
(I *love* drift that basically says "Microsoft is for losers".)
Being an absolute beginner in all of the languages I am attempting to learn including *DB* syntax and scripting,one reality always appears at the worst possible moment,"coding standards".Standardization seems to be a dirty word to big proprietary software companies.The evidence being in their careless cretion of their "own standards" for everything they build.As said above,this behavior leads to the purchase of additional products that will supposedly fix a problem or limitation in previous products or releases.The almighty dollar has spoken in the Microsoft antitrust cases now settled.One penalty being that Microsoft must give away hardware and software to schools at reduced cost or for "free".This "penalty" will create even more "need" for MS products and further a bigger MS monopoly. Perhaps because of the continuous migration of "hard copy" information to electronic data dump,we should consider better standards for electing lawmakers.Lawmakers with the capacity for and interest in these electronic solutions we are placing such faith in.Maybe a special "technology awareness test"?I feel as though I might sleep a bit better at night knowing that nobody will be passing a new "superbill" limiting our right to communicate or read words. I suppose boycotting and refusing to work on projects requiring the use of certain products is the only recourse.That may not make a difference either. Oh well *DB to *DB fixer-upper-tools....big money comming right up.
http://c:\aux I rest my case.
You have several choices: