July 20, 2010

I’m an MCITP SQL Server 2008 DBA!

Posted in Certification, Learning, SQL Community, SQL Server, SQL University at 10:00 am by David Taylor

Last week, I passed the Microsoft 70-450 exam, PRO: Designing, Optimizing and Maintaining a Database Administrative Solution Using Microsoft SQL Server 2008. I have the SQL Community to thank for it, and I’ll outline why as we go. Mainly, I was asked to write a blog post about how I prepared for the exam, so I will cover what I found helpful, and what I found not so much.

The preparation starts, as with all Microsoft exams, on Microsoft’s Learning site. For this exam, one would start at http://www.microsoft.com/learning/en/us/exam.aspx?ID=70-450, which gives the overview, skills measured, preparation, and community links. The overview gives just that, the 50,000 foot look at the exam and the intended audience, along with what Microsoft thinks should be the set of features and concepts the candidate should have experience in. Good enough. I then skipped right to the Preparation Materials tab, thinking I would, like I did for the 70-432 MCTS exam, order up a Training Kit. How often do you see these two words together? #Microsoft – #Fail!

What I found was the following, copied verbatim from the page:

Preparation Tools and Resources
To help you prepare for this exam, Microsoft Learning recommends that you have hands-on experience with the product and that you use the following training resources. These training resources do not necessarily cover all of the topics listed in the “Skills Measured” tab.

Classroom Training
There is no classroom training currently available.

Microsoft E-Learning
There is no Microsoft E-Learning training currently available.

Microsoft Press Books
There are no Microsoft Press books currently available.

Hmm.

OK, plan B. Apparently Microsoft isn’t interested in training their users for a product that has been on the market for two years, and has actually been superseded by an R2 product! Looks like I needed to find my own way to study for this thing. Luckily, I am a member of the online SQL Server community, and can learn from the best! So, following Buck Woody’s school of thought, I turned to the Skills Measured tab.

If you haven’t seen it, this exam covers a lot. I mean a whole lot! The list of skills measured is actually kind of intimidating, at first. There are seven broad categories of features covered, with innumerable bullet points within each category, which in turn all have multiple objectives. I had taken the MCTS in February, and that was a little difficult, now I had scheduled the MCITP for July 6th, figuring that six months should be ample time to prepare for it.(I know, I know, I missed the Second Shot deadline, but there were reasons.) I was starting to wonder if that were long enough.

The first thing I did was look to see what was out there besides Microsoft’s non-offerings. I had heard good things about Train Signal, and they were having a sale, buy their Advanced SQL Server 2008 Database Administration Training DVD, and get their Windows Server 2008 R2 Training DVD free. Their DVDs do not come cheap, but one thing I had going for me was that getting certified was part of my annual goal at work, so they footed the bill.

So I sat through the 20-some hours of videos, thinking through 60-70 percent of it, ‘I know this stuff! And I am being talked to as if I were in the studio audience at a taping of Barney the Dinosaur!’ Really, it wasn’t that bad, but for something that bills itself as ‘Advanced Training,’ they sure spend a lot of time on the basics! Overall, I would say it’s a decent reference to have around, but I’m not sure it’s worth what they’re charging.

Next, I figured that I should have followed my first instinct, and turned to the community. I asked around a bit, and heard that several people had gone several ways, but the common denominators were Books Online, and Practice.

Always, it’s the basics that works. If you want to lose weight, eat less and exercise. If you want a good retirement, live frugally and save. How often is it that we lose sight of these things?

So for the last few weeks before the exam, that’s what I did. I went down the list of objectives in the Skills Measured list, looked up each term I didn’t already know in BoL, and tried each thing out that I could on a test server (my home computer, really – I don’t have a real ‘test server.’ At the time, (and even this moment) I did not have the proper setups to actually practice or perform any of the multiserver HA stuff, like clustering, or replication. I only have one little server box at work, with a whopping 2GB ram, and my home computer to play with. So really, I just read through everything I could about those technologies, trying to at least make sure I had a proper understanding of the concepts.

Throughout, as I said in the beginning, I had the community of which I am so proud to be a part. Every day folks were writing multiple blog posts, SQL University had a week on Certifications, just all kinds of resources to learn from. Actually, some of the best tidbits came from people’s #sqlhelp questions on Twitter, as they tried to figure out why one or the other of the things listed in the Skills Measured wasn’t working, the answers given were actually found as some of the multiple choices on the exam!

So that’s how I prepared. I found out what the requirements would be, I determined how to learn those requirements, I got a nifty couple of training videos, then I followed the community’s recommendation – I got back to basics. I followed the RTFM method, and practiced all I could. Now, because of all that, you are now reading the words of a Certified Professional.

July 2, 2010

Database Design 101c, #SQLU

Posted in Learning, SQL University at 9:00 am by David Taylor

Welcome Back Once Again!

For today, our last class together at SQL University, (#SQLU on Twitter, the brain child of Jorge Segarra, who blogs at The SQL Update Statement, and can also be found championing the Social side of DBAness as @SQLChicken on Twitter), let’s talk about keys, both primary and foreign, discuss indexing, and finally, how to get information from data. Busy day today, so let’s get into it!

Thanks for my involvement (one more time!)

I would like to once more express my heartfelt thanks to Chancellor Segarra for inviting me to lecture in this esteemed forum. I’d also like to thank Coach Larock for re-emphasizing what I am telling you about here – his lecture was better than all three of mine combined! Read it, if you haven’t already!

Where are my Keys?

Keys are important!

When we talk about keys in a database, we are talking about a couple of different things. Primary keys are those fields in a table which help define and delineate the table, like an order or customer number. These are not repeated in a table because they are used to identify each row of data, and so can be used to index a table, as we will discuss shortly.

Foreign keys are important, too!

Foreign keys are sometimes a tricky concept to understand. Tables are related in what are known as Parent-Child relationships, and it is the keys that define and control that relationship. In my previous lecture, we laid out a data model for Customers and Orders. When we normalized the table, we ended up with one table that had Orders, separated out from the Customers table, as illustrated below.

Orders

Cust# Order# Part# Desc
0001 001012 345 Disc
0002 01453 651 Bowl
0003 01232 345 Disc

 In this table, the Order# field is the Primary key for this table, and the Cust# field is the foreign key. It relates this table back to the Customers table, so that orders can be matched to customers.

Constraints

Another thing Primary and Foreign keys allow besides relationships is a concept known as constraints. When the database is set up properly, the rows of data in both tables are constrained to each other. This means that the relationship between them is such that, for example, you can not delete a customer without first deleting that customer’s orders. Imagine if that constraint was not in place, and someone deleted a customer. The orders would still be out there, with a foreign key pointing to nothing.

Indexes, or how do I find things?

Primary Keys Define Indexes

The Primary keys that you define when constructing a table are generally also used to define the index for the table. Database indexing is a large subject unto itself, in this class we will just go over them briefly. Essentially, an index is how data gets found within a table, much like an index is used in a book to find information.

In a book, you would turn to the index to look up a certain word, and the index, in turn, would point you to the page on which that word was found. A database index works much the same way. When a table is queried, and the query contains a Where clause to limit data, the database engine takes the predicate of that Where clause and looks, or seeks, through the index looking for that predicate. Once found, the rest of the record is also located, and the query returns the data to the user.

Without an index, the database engine would need to search through every row of the table looking for the predicate before it could return the row in question, so you can see how much greater performance will be with an index.

Indexes can be defined as something other than the Primary key, but in general, that’s what the index is based on.

Getting Information from Data

Raw Data is hard to Read

When a database is designed properly, with the correct amount of normalization for the data, in its raw form it is not very intelligible to the human mind. There are keys and tables, bits of data here in this table, other bits in that table, and simply reading through a table will not get you very far if you want to know what’s going on in your business. Properly queried, though, there is real information in your tables for the business to keep tabs on their business, find trends in data, and help the business stay afloat and even grow.

All of this is made possible with the right set of queries to pull from the data. In my previous lecture, I showed how not to build a table, and would like to illustrate here why it shouldn’t be built like that. To remind, here is the obnoxious offender.

Customers

FirstName LastName Address City State ZipCode Order# Part# Desc
John Smith 123 Any St Big City MA 02134 001012 345 Disc
Jane Doe 456 Main St Small City MA 01234 001453 651 Bowl
William Smith 2 Mulberry Ln Small City MA 01234 001232 345 Disc

 I will ask again, for the sake of time, to imagine that there are more orders added to the right side of the table, 3 columns for each, and, further, imagine that a sales manager would like to know how many of what product is being sold to a certain city. The query would have to include lines that take each order column and collate them together into their own column, bringing the city only once for each row, and only then being able to get counts for each city’s orders. This would involve messy and low performing pivots or CASE statements to get the orders separated out, so bad I am not even going to try to right it out here, please just imagine it. And then imagine that this little table is thirty or forty columns wide with millions of records. The manager may as well start the query on Monday and hope for a result by the end of the week!

 In a properly designed database, though, the query would simply get the cities from one table, join to orders in another, possibly tying them together with customers as a binder, and his count of product orders per city could be returned in seconds. And if such data could be returned quickly, other ways of querying can be used also taking mere seconds each to get concepts like trends over time, trends over time and area, customer payment histories, etc.

I realize this has been a very simplified discussion, but (A) I’m up against having to take care of other things, and so must finish this lecture, and (B) this is a 100 level class, so I am just trying to get the concepts relayed to you, with the hope that you will endeavor to take up the ideas and go further with them yourselves in your homework and research on your own.

Wrap it up

Thanks so much for joining me!

This week has been an incredible experience for me being able to share with you all this information. I thank you so much for joining me! It is my sincere hope that you’ve gotten something to take away from it, if even only a single concept. If you have any questions, be sure to post them in the Comments, below, and I will answer them ASAP.

Check this out for homework

For homework, I am going to once again suggest someone *much* smarter than I to reference on the subject of Indexing. I am giving here a 100 level class; the lady to whom I refer can take you as far as you want to go. I speak here of the Incredible Kalen Delaney, who literally wrote the book, Microsoft® SQL Server® 2008 Internals, the definitive guide to the internals and architecture of the Microsoft SQL Server 2008 relational database engine. She can be found at http://www.sqlserverinternals.com/, and her blog is at http://sqlblog.com/blogs/kalen_delaney/default.aspx. I would simply suggest reading everything there, you won’t be sorry!

Thank you all for joining me this week, I hope I was able to teach you something you can use.

June 30, 2010

Database Design 101b, #SQLU

Posted in Learning, SQL University at 9:00 am by David Taylor

Welcome Back!

For today’s class at SQL University, (#SQLU on Twitter, the brain child of Jorge Segarra, who blogs at The SQL Update Statement, and can also be found championing the Social side of DBAness as @SQLChicken on Twitter), we’ll be delving a little bit deeper into gathering requirements, we’ll determine what our database will look like based on those requirements, discuss normalization of data and touch on a couple of different ways databases can be built.

Thanks for my involvement (again)

I know it sounds like I’m sucking up (and maybe I am, I’m not telling!) but I would like to once more express my heartfelt thanks to Chancellor Segarra for inviting me to lecture in this esteemed forum. It can’t be said enough what a great opportunity this is, and I am greatly appreciative.

Business Requirements

Why are they important?

There are many ways to build any database, but they all start with one question – what data will they hold? This is determined by the requirements of the business. In a lot of cases a business will live or die on its data accuracy and completeness, which should illustrate its importance. I know I am sounding like a parrot, repeating the words “business requirement” over and over, so let’s take this time to define what I mean.

What are they?

Business requirements simply means this: What data does the business need in its day to day operations? Taking some examples from my last lecture, if we’re designing for a financial institution, then our database will be holding customer names and other identifying information, financial transactions and the like, including information dealing with the institution itself, and how all the customer transactions affect the overall business. If we are designing for some sort of retail outlet, we again will probably have customer identifying information, along with their orders and payment methods, and we’ll probably also have some part of the database devoted to keeping track of the business’ stock and supplies.

In the initial stages of design, the database designer should sit down with the business’ management, and ask what sorts of data they expect to deal with. Some parts of it, as I mention above, will be obvious and quick to go over, some not so much. In the retail scenario, you may run across details such as the warehouse location system, the different methods of shipping, etc. Asking for further and deeper detail will make the design process go easier, as it is better to have all of the entities (defined as pieces of data, such as customer names and addresses, or warehouse locations) to be recorded defined at the outset.

What managers/customers want vs. what they need

I used this heading in my last lecture, but it bears repeating. Managers and customers have an idea what they want, but they don’t know what they need! A retail manager will tell you that the business needs to know a customer’s address, this is obvious. What may not be so obvious is that customers can, and do, have more than one address. In addition to possibly separate billing and shipping addresses, customers can have more than one address to ship to; they may be shipping gifts to other people, etc. This means that we need to take into account multiple addresses for each customer. This is only one example, but it points out that only a very thorough and detailed conversation about the business’ use of data will yield what their data needs are, and the data professional building the database needs to be savvy enough to understand the underlying data needs.

Requirements Gathered, Now What?

The Entity-Relationship Model

That was a mouthful, what is that?!?!? Entities are groups, or clusters, of information, such as the group of information that defines customer identifying information, names, addresses, etc. As we’ll discuss in the normalization section, how different entities relate to one another is the relationship part of that tongue twister, for example, the orders that a customer makes – orders and customers are related to each other so the business can tie them together. Finally, the Model part of the heading refers to how the database is initially conceived – as a model, not necessarily a true database.

Database modeling is the act of defining the entities, filling those entities with attributes; that is pieces of data that help define the whole, such as first name, last name of a customer, then finally relating those entities to each other. It is at this point in the process that database design tools come into the picture. You’ll remember from the first lecture that I stated that these tools can run the gamut from a napkin and pencil to very expensive, complicated software that can model the data then actually build the database for you!

The Chicken and the Egg

I got stuck at this point in writing this lecture, unsure whether to dive into Modeling tools, or Normalization first. The two sort of go hand in hand. The tools build the relationships; the relationships are determined by the level of normalization. It was when I finally wrote that last sentence that I realized we need to discuss normalization first, tools later. So…

Normalization can’t be stressed enough

What is this Normalization?

Normalization is the result of normalizing data. Clear as mud, right? By normalizing data, we remove repetition from within the data. Let’s work through an example. If I need an entity that describes a customer, what should that entity contain? At minimum, it should have at least the following columns, or fields, of data: First Name, Last Name, Address, City, State, and Zip Code. I’m in the US, so for purposes of discussion, I will ignore other country’s systems of address postal boxes or residences. Further, there may be more detail required, but these fields will suffice for discussion.

Now, an entity, or table, with the above fields, looks on the surface to be perfectly valid, but let’s add some data to it:

Customers

FirstName LastName Address City State ZipCode
John Smith 123 Any St Big City MA 02134
Jane Doe 456 Main St Small City MA 01234
William Smith 2 Mulberry Ln Small City MA 01234

 This is, obviously, a lame data set, but it’s enough to illustrate the point. Look at the repeat data within the data set. People with the same last name, people in the same city. While it doesn’t look like much here, how much repetition will there be with thousands of customers? And, what if there are orders, in the same data? Look at this abysmal example of an entity:

Customers

FirstName LastName Address City State ZipCode Order# Part# Desc
John Smith 123 Any St Big City MA 02134 001012 345 Disc
Jane Doe 456 Main St Small City MA 01234 001453 651 Bowl
William Smith 2 Mulberry Ln Small City MA 01234 001232 345 Disc

 Again, lame, just follow me. What do we do if there are more orders than one for a given customer? Add more columns? How do we go about finding those extra orders later? And can we use this type of system to keep up with inventory? I hope you see the answer to these questions.

The answer is normalization, and the way to do it is through what are known as keys. Keys are like codes that define different parts of the data, allowing us to match data in one entity to data in another by way of relationships. The quickest way to explain is by illustration, and I will illustrate the concept by showing you the result of normalizing this set of data into the following related entities:

Customers

Cust# FirstName LastName Address
0001 John Smith 123 Any St
0002 Jane Doe 456 Main St
0003 William Smith 2 Mulberry Ln

 CustomerCities

Cust# City#
0001 001
0002 002
0003 002

Cities

City# City State ZipCode
001 Big City MA 02134
002 Small City MA 01234

 Orders

Cust# Order# Part# Desc
0001 001012 345 Disc
0002 01453 651 Bowl
0003 01232 345 Disc

 It looks like more data than we started with, and it is, for this small data set. It also can be normalized further, as I will discuss in a moment. But you can already see the advantage of this system as the data set grows and the data gets further normalized. There will be fewer repeated rows of information, as illustrated in the Cities entity, and, once the orders entity is normalized to separate out part numbers and descriptions into their own entities, even less.

The Normal Forms

I will just touch on this subject, and allow the student to follow up – in fact, one of my fellow professors will be in this week with a more detailed article on normalization, in more fact, I am honored to be partnered with this particular professor this week – I will leave his identity a surprise. For now, though, realize that there is what is known as different Normal Forms, called First Normal Form (1NF), Second Normal Form (2NF), etc., right on up to Sixth Normal Form (6NF).

E. F. Codd, the inventor of the relational database, came up with the set of rules to define how data should be organized within a database, these are the Normal Forms. Essentially, each hierarchy of Forms reduces repetition further. The Normalization illustrated above is really only 1NF. Breaking out the cities into their own table with their own keys starts into 2NF. As you go up through the Forms, at the top you are left with only two columns per entity, the relational key and the data, such as a City# (key) and City (data). While it looks strange at first glance, the more Normal your database, the easier it is to query.

Data Modeling Tools

Now that we have some understanding of Normalization, what tools are available to us to build a Normalized Database?

The Napkin and Pencil

I love the fact that this set of tools is valid for building databases – it emphasizes that databases are, at their base, intuitive things. What we’ve done above is really a napkin and pencil database design, even though it was built in a word processor. It was just a quick jotting of what data is needed, and then a breaking down, or normalization, of that data. The same can be done with pencil and paper. A bit more time consuming than some other ways, but still valid.

The Big Guns – Professional Data Modeling Software

There are quite a few commercial and free data modeling software packages available to the designer. These can take the input of data needs and visually design the database on screen, ending in actually creating the Tables, Primary and Foreign keys and other objects in a ready to use database. Prices range from nothing all the way up to prices so high websites won’t divulge them – you need to contact the company’s “Sales Engineer” to find out how much. I say, if you need to ask how much, you probably can’t afford it!

An Option, and a Caution

Within SQL Server is a Database Diagram tool, within which you can lay out tables and keys, and relate them to each other. This is an option for the intrepid designer, which Microsoft obviously thought, as they include it in the package. There is one caution you must remember if you use it, though – it builds the database as you diagram it, while you diagram it! This means that SQL Server is changing and rearranging your database as you make changes to it, and if this is a production system you are doing it on, your overall server performance will suffer. Use it if you will, but be warned of the consequences.

Wrap it up

What we’ll cover next time

For our next class, we will discuss creating the physical database (so far we have only really talked about building a model of the database), we’ll talk about keys a bit more in depth, and their less thought of brothers, indexes, and we’ll talk about some methods of getting information from data, to illustrate why the design is important, something I said I would do today, but realized I needed to cover a few more subjects before we got there..

Check this out for homework

For homework, I am going to go out on a limb and suggest someone *much* smarter than I to reference on the subject of Database Design. I am giving here a 100 level class; the gentleman to whom I refer can take you as far as you want to go. I speak here of the Great Buck Woody, who is a SQL Server MVP and is currently a SQL Server Technical Specialist at Microsoft, and point you to his starting article entitled Database Design: Requirements, Entities, and Attributes. There are links at the top and bottom of the article to go Next and Back, I would suggest you use them.

Until next class, then, enjoy yourselves.

June 28, 2010

Database Design 101a, #SQLU

Posted in Learning, SQL University at 9:00 am by David Taylor

What is #SQLU

SQL University, known as #SQLU on Twitter, is the brain child of Jorge Segarra, who blogs at The SQL Update Statement, and can also be found championing the Social side of DBAness as @SQLChicken on Twitter. The community owes Jorge a deep debt of gratitude for making #SQLU possible. The gathering and dissemination of knowledge is one of the highest achievements to which a community can aspire, and it takes great leadership to do such, which is why, I am sure, Jorge has been nominated for the prestigious Exceptional DBA Award.

Thanks for allowing my involvement

I would like to take this opportunity to express my heartfelt thanks to Chancellor Segarra for inviting me to lecture in this esteemed forum. It is said that everybody with some knowledge is capable of sharing that knowledge; I hope I can do this University the justice it deserves.

Welcome back to Monday, I hope you enjoyed the weekend

I know Professor McCown left you Friday with a lot to think about regarding women in technology. I hope you found time in your busy lives to take initiative with any women you know that are teetering on the edge to get them into this field, and will always keep that in mind.

This has been a busy semester for you all, and I know there has been a lot covered. What I’d like you to focus on this week might be construed as a bit more basic than some of the technical lessons you have received, but I feel that getting the fundamentals right will do worlds of good when those higher steps are taken.

To that end, this week’s class at SQL University, we are going to discuss database design.

Plan for the week

My plan for this week is to start with a high level overview of database design, then discuss further aspects of design, such as why it is important, who will be using it, how to implement it, what tools are available to help the designer, and what the consequences of both good and bad design, or even no design could be.

There will be no test at the end; your own work will show you how well you have absorbed the lessons. There will be homework every day of class, and keep in mind that Coach Larock will probably also have work for you to do.

Database design is a complex subject, but, as I have mentioned, getting it right is fundamental to your database being usable, having good performance, being scalable, and, above all, being pertinent to the customer.

The 50,000 foot view of DB Design

Why do we care

We data professionals need to care about the design of the databases under our care for multiple reasons. First and foremost, we make our livings from these entities, and we must justify our earnings. Also, the majority of us are charged with the maintenance of the databases we work with, and good design facilitates easier maintenance. Finally for this discussion, we all hope the businesses that entrust their data to us will grow, and the databases underlying that business will very likely grow along with it. A well designed database can handle multi-fold growth in stride – correct steps taken at the outset will allow later growth to be unimpeded.

What managers/customers want vs. what they need

It is a truism that managers and customers have an idea what they want, but they don’t know what they need until what they want has been in implementation for some period, and they can see how things work. The database designer needs to know at the outset the end user’s business needs, rather than what the customer thinks she wants, because it’s what is needed that matters. We will spend a fair amount of timing discussing how to determine those needs when we get into discussing Business Requirements.

What tools can help us out

There are several tools available, both free and not, for assisting with database design. Some are better than others, and, as in any aspect of software design, there are defacto standard tools out there. These tools can range all the way from the back of a napkin and a pencil to a full blown ERM (Entity Relationship Model) design software tool. We will briefly touch on several in this range as time allows throughout the week.

Why be Normal?

Normal in this case does not refer to where one stands in society! Rather I am talking here about data normalization. This is a vast subject unto itself, with a great number of great minds having chimed in on the subject over the years. We will be discussing the basics of normalization, the different normal forms and when one would choose the different forms as the database gets designed. Normalization is data simplification, an elimination of repetition of data within the data store.

Does it fly?

Again, I am alluding to a concept here. Databases can scale to incredible heights. Transactions per second can be in the thousands, as can the number of concurrent users. Good design will allow the database engine to handle such large numbers of transactions while not so good designs can cause such conditions to bog down servers irrevocably. This is probably one of the most crucial reasons for learning good design.

Why worry about Database Design

They’re just Excel tables inside SQL Server aren’t they?

As mentioned, most managers and business data owners think they know what they want. Just about anyone in the head office can open a copy of Excel and start typing in a collection of information and generate useful reports from that information, all within Excel. This causes those same people to think that what is deep inside SQL Server is of the same nature, that the data professional is simply building spreadsheets for users to store data from their various forms.

If you’ve not yet been exposed to good database design, you may think the same thing. I will make a confession, here, when I started out is SQL Server, I did the same. Each group of pieces of information was contained in single tables, and I thought the power of SQL Server was only that it had great facility to slice and dice the data. What I found was that it was very difficult to write queries against such tables to obtain the necessary consolidation of data into useable information, and further, I found that such arcane and convoluted queries suffered terrible performance.

Performance

Which is the reason I gave performance its own heading. SQL Server is fast. Unbelievably, almost magically, fast. If what you are doing with it is designed correctly. To give a concrete example, one of my own business tasks was to generate a report of yields from a manufacturing process. In the beginning, with my ‘Excel tables’ and convoluted queries, it took almost two minutes for the report to be rendered each time it was opened. The management team relying on the report was not happy, to say the least. No one wants to stare at a spinning indicator for two minutes, waiting for a daily report to appear.

I learned, then employed, the concepts I will be teaching this week, and found that a properly designed database yielded simple querying techniques which reduced those two minutes to sub-second performance. Because of that (and other things), I remain employed.

Business Requirements can’t be stressed enough

Performance, normalization, querying techniques – none of these matter if requirements are not met. Remember that the business’ data in many cases *IS* the business, and that without it the business would at best suffer, and at worst, cease to exist. The data within the database needs to be understood by the data professional, in enough detail that proper tables can be constructed to house the data efficiently, and that querying devices such as views, stored procedures, etc. can be easily built to gain real information from the raw data to assist the business owners and employees in the success of the business.

Talking to customers to get the requirements

The only real way the data professional can understand the business and its data sufficiently is to simply talk with the customers. The customers, in this case, means any consumer of the data, be they someone that hired you as a consultant, or they are simply higher up in the chain of command within the business. Meeting with the producers and users of the data to gain insight into what the data is and how it will be used is often the best and sometimes the only way of gaining enough understanding to properly design the database.

Understanding the business

Understanding the business is the next level of detail required. Along with tables, queries and the like, the database designer needs to also consider such concepts as security access and data usage to answer such questions as who can see and interact with the data, and what parts of the data will be used for which parts of the business? Again, these are parts of the Business Requirements which can be gleaned and understood from meeting with and talking to the customers and managers and users of the data.

Understanding the data

A corollary to understanding the business is to simply understand the data itself. Is it a collection of financial data, as in a bank or mortgage loan business? Is it orders and customers, as would be found in a retail business? These types of data are vastly different both in their collection and their usage, and understanding, at least in general, if not in detail, the data that will make up the content of the database to be designed will end in a vastly better database than one which was designed without that knowledge.

Wrap it up

What we’ll cover next time

For our next class, we will delve into actually building a (simple) database, based on requirements that I will describe. We’ll build it two ways, one bad and one better, so you can learn to discern the difference. We’ll cover normalization in a bit more depth, and I will describe some different ways of building the database, utilizing a couple of the different tools available. We’ll discuss also some methods of getting information from data, to illustrate why the design is important.

Check this out for homework

In the meantime, it will be helpful for you to learn what not to do, before we even start to do anything. To that end, I would like you to read a piece by  Dr. SQL himself, Louis Davidson, author of SQL Server 2005 Database Design and Optimization. The specific blog post I would like you to read is entitled Ten Common Database Design Mistakes. There is a wealth of information to be had here, and I would like you to have this information in mind as we discuss the proper design of a database.

Until next class, then, enjoy yourselves.

June 3, 2010

I’m a #SQLU Faculty Member!

Posted in Learning, SQL Server, SQL University at 10:00 am by David Taylor

I read a blog today, about blogging, and one of the takeaways I got was that spending words apologizing for not blogging (or not blogging often enough) is a waste of your time and mine, so I will stop wasting your time.

This blog’s about learning and it’s about SQL and about learning SQL. They say the best way to learn is by teaching (See Buck Woody’s post about this). In that vein, I have taken on a teaching position. Maybe a small one, in the larger realm of things, but you gotta start somewhere.

I will be taking a week of lessons as a faculty member of good old #SQLU, SQL University, that awesomeness created by Jorge Segarra, better known in the Twitter world as @sqlchicken. My week will be all about Database Design, and, if you saw the databases I manage, you would cry, but it’s been a great learning opportunity, so I feel qualified.

So if you haven’t been following #SQLU, get over there and get your learn on! This semester has had some of the greats of the SQL World, with subjects so far covering Documentation, Professional Development, Security and Auditing, Tools of the Trade, SSIS Design and Development, PowerShell, Parallelism and Testing and Refactoring. Some heady subjects, lots to learn. And don’t forget to also get over to Thomas Larock’s (@SQLRockstar on Twitter) blog for the weekly Basketball practices, where he teaches you to use the knowledge you are gaining.

I look forward to serving you, the students of #SQLU, and working with the Faculty already in place. And maybe I’ll even get me a #SQLU Faculty Member Tee shirt to go along with my #SQLU Alumni shirt!

Follow

Get every new post delivered to your Inbox.