< mysql lab 1 > (Originally titled "sql jd" and made by Joshua Dickinson, Updated by Yoon Chung Han) spl data types and description



Download 60.82 Kb.
Date conversion02.05.2016
Size60.82 Kb.
Data Visualization 2013
< mySQL Lab 1 >

(Originally titled "sql_jd" and made by Joshua Dickinson, Updated by Yoon Chung Han)
1. SPL data types and description
1. cout

The date and time at which an item was checked out. This is stored as a MySQL “datetime” type, which means we can perform MySQL functions on it like “date()” or “unix_timestamp()”.
2. cin

The date and time at which an item was returned, also stored as a MySQL “datetime” type.
3. collcode
4. itemtype

This indicates what physical type of item this is, for instance a book, dvd, or cd.
5. barcode

A library specific bar code that is unique to one physical item.
6. title

The item’s title. Obviously multiple items can share the same title, but they’ll have different bar codes and might even have different itemtypes. Like “Ghostbusters” the novel and “Ghostbusters” the VHS.
7. callNumber

A call number is unique to single item within the library catalog. Multiple copies of the same book will share the same call number but will each have different bar codes.
8. deweyClass

Some items have a Dewey Decimal Classification number. Others do not, so often this category is marked NULL. Dewey numbers are pretty useful for understanding the item’s content: “The DDC attempts to organize all knowledge into ten main classes. The ten main classes are each further subdivided into ten divisions, and each division into ten sections, giving ten main classes, 100 divisions and 1000 sections. DDC's advantage in using decimals for its categories allows it to be purely numerical, while the drawback is that the codes are much longer and more difficult to remember as compared to an alphanumeric system.”

http://en.wikipedia.org/wiki/Dewey_Decimal_Classification

http://en.wikipedia.org/wiki/List_of_Dewey_Decimal_classes
9. subj

Subj is like keyword tags that give us an idea about the item’s content. Usually only the first few subject entries are used and the rest are marked NULL.

2. Querying the Data

I. First we need to install a MySQL client so that we can communicate with the server on which the database is stored.
Mac users should install Sequel Pro.

Windows people should install HeidiSQL or alternatively MySQL Workbench.

Linux people should try MySQL Workbench.
There are many other database management programs to pick from but these are the programs that have worked well in the past.
II. Accessing the database
Host: tango.mat.ucsb.edu

User: mat259

Password: V1sual1zat1on

Database: (optional)

Port: 3306
III. Forming a query
Queries are like commands that allow us to access a small subset of the database based on a set of criteria. The entire dataset has something like 60 million entries so we don’t want to look at all of that all at once.
Here’s an example of a simple query:

select * from inraw where title = "catch 22";
In English this means:

show me all the data about entries from the “inraw” set where the title is “catch 22”.
Now we only want to see DVDs:

select * from inraw where title = "catch 22" and itemtype = "acdvd";
In English:

show me all the data about entries from the “inraw” set where the title is “catch 22” and the item is a DVD (adult section).
IV. Forming an interesting query
select * from inraw where title like "%vampire%";
In English:

show me all the data about entries from the “inraw” set where the title has the word “vampire” somewhere inside of it.
The “%” symbols are a regular expression which stand for any number of characters or spaces, so in the example above “vampires” matches and so would “emo vampires” but not “vampiring”.
select * from inraw where title like "%vampire%" or subj like "%vampire%";
In English:

show me all the data about entries from the “inraw” set where the title or any of the subjects has the word “vampire” somewhere inside it.
Parenthesis allow us to string together arbitrarily long sets of criteria:
select * from inraw where (title like "%vampire%" or subj like "%vampire%") AND title like "%blood%" AND (itemtype = "acbk" or itemtype = "acdvd");
In English:

show me all the data about entries from the “inraw” set where the title or any of the subjects has the word “vampire” somewhere inside it AND make sure that the title has “blood” somewhere in it AND make sure that the item type is either an adult book or an adult dvd.”
V. Narrowing a query
So in the previous examples we’re looking at the data from the entire year 2006-2011. If we want to narrow the time frame that we’re searching, we need to add a time criteria into the search.
select * from inraw where (month(cout)>1) AND (month(cout)<5) AND title like "%nosferatu%";
In English:

show me all the data about entries from the “inraw” set that were checked out from February through April AND make sure that the title has “nosferatu” in it.”
Now let’s narrow the data that the query returns, since we want to focus on dates and ignore everything else. We’re going to ask it to return only the cout, cin, and title.
select cout,cin,title from inraw where (month(cout)>1) AND (month(cout)<5) AND title like "%nosferatu%";
In English:

show me the ckeckout date and time, the return date and time, and the title of entries from the “inraw” set that were checked out from February through April AND make sure that the title has “nosferatu” in it.”
Since now we’re specifying exactly what we want back, we can also include bits of code in our requests. Here’s an example of how to find out how many days something was checked out for.
select cout,cin,title,TIMESTAMPDIFF(HOUR,cout,cin)/24.0 from inraw where (month(cout)>1) AND (month(cout)<5) AND title like "%nosferatu%";
Where we say “HOUR” we could also use any MySQL unit type, such as MICROSECOND, SECOND, MONTH, etc. Here’s a list options: http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_date-add
VI. Organizing the results
Maybe we want to see all items that were checked out on a specific day, ordered chronologically by the day they were returned.
select * from inraw where date(cout) = '2011-01-10' order by cin;
In English:

show me all the data about entries from the “inraw” set that were checked out on January 10, 2011 and arrange them in order of the day they were returned



Or if we want to do this in descending order, we can add the descriptor DESC to the end of the query. The default option is ascending, or ASC.
select * from inraw where date(cout) = '2011-01-10' order by cin DESC;
We can also have hierarchical ordering, for instance putting something in order of date and then alphabetically for items that are returned on the same day. Additional ordering criteria are separated by commas.
select date(cin),title from inraw where date(cout) = '2011-01-10' order by date(cin) DESC, title;
VII. Converting natural language (like English) into MySQL queries
In English:

show me the title, itemtype, and ckoutDateTime of all items from the “inraw” set that were checked out on February 20, 2011 with either “samurai” or “ninja” or “sword” or “fighting” in the title and arrange them reverse alphabetically by title.”
solution:

select title, itemtype, cout from inraw where date(cout) = '2011-02-20' and (title like "%samurai%" or title like "%ninja%" or title like "%sword%" or title like "%fighting%") order by title DESC;
In English:

show me the title, itemtype, and duration of checkout period in days of all items from the “inraw” set that were checked out on February 20, 2011 and were kept longer than a month and arrange them by the length of the checkout.”
solution:

select title, itemtype, TIMESTAMPDIFF(DAY,cout,cin) from inraw where date(cout) = '2011-02-20' and TIMESTAMPDIFF(DAY,cout,cin) > 30 order by TIMESTAMPDIFF(DAY,cout,cin) ASC;
In English:

show me everything about items from the “inraw” set that are the itemtype “slides” or “microfilm”.” (Look at the list of item types)
lame solution:

select * from inraw where (itemtype = "acslide" or itemtype = "arslide" or itemtype = "ahmfm" or itemtype = "armfm" or itemtype = "dhmfmnp" or itemtype = "drmfper");
awesome solution:

select * from inraw where (itemtype like "%slide%" or itemtype like "%mf%");




VIII. More advanced queries
select floor(deweyClass/10)*10 as dewey,

SUM(CASE WHEN year(cin) = '2005' THEN 1 ELSE 0 END) as yr2005,

SUM(CASE WHEN year(cin) = '2006' THEN 1 ELSE 0 END) as yr2006,

SUM(CASE WHEN year(cin) = '2007' THEN 1 ELSE 0 END) as yr2007,

SUM(CASE WHEN year(cin) = '2008' THEN 1 ELSE 0 END) as yr2008,

SUM(CASE WHEN year(cin) = '2009' THEN 1 ELSE 0 END) as yr2009,

SUM(CASE WHEN year(cin) = '2010' THEN 1 ELSE 0 END) as yr2010

from inraw

where deweyClass is not null

group by floor(deweyClass/10)*10;
In English:

show me how many items are checked out in years 2005, 2006, 2007, 2008, 2009, and 2010 from the inraw set where the dewey class isn’t NULL and then separate these into dewey decimal groups of 10.
--

select COUNT(*),COUNT(IF(title = "catch 22", 1, NULL)) from inraw;
In English:

Show me the total count and the number of things that have the title “catch 22” from the inraw set.”
IX. Now that you understand how to form different types of queries, try to explore the dataset in order to find interesting patterns. Think about what you want to search for in English and then translate it into code like we’ve seen. This will allow you to keep track of complicated searches and will also help you to explain your process to the rest of the group as we move forward.


The database is protected by copyright ©essaydocs.org 2016
send message

    Main page