Showing posts with label mysql. Show all posts
Showing posts with label mysql. Show all posts

Sunday, November 7, 2010

Select Odd Rows in MySQL

Any time you can use a bitwise operator in place of a function call, it's a good thing. Selecting odd rows from a table in MySQL that have an auto_increment primary key is no exception.

SELECT * FROM mytable WHERE id & 1;


If you're unfamiliar with a bitwise AND, basically what it says is return true of the left operand has the same bits set as the right operand. The number 1 only has one bit set, and conveniently enough that one bit is the same bit that will only be present in odd numbers.

Sunday, September 27, 2009

PHP MySQLi bind_result Empty LONGTEXT

The problem is that when using the PHP MySQLi extension with prepared statements, I was getting an empty value in one of my bound variables after using $mysqli->execute(), $stmt->bind_param(), and $stmt->fetch(). If I executed the exact same query using $mysqli->query() and subsequently using $result->fetch_object() I would get the values I was looking for.

I went as far as copying and pasting the variable names associated with the column from every place they were used in the script and comparing them to make sure there were no typos.

Now, if I leave everything exactly how it is, I can toggle this behavior by switching the data type of the problematic column between TEXT and LONGTEXT. When the column is LONGTEXT then the prepared statement returns an empty string to the bound variable. If the column is TEXT then everything works as expected.

Strange behavior to say the least. I've caught a few unresolved PHP bug reports that appear to be related to this when looking for PHP MySQLi LONGTEXT, but nothing too detailed.

In any event, we'll be able to work with a TEXT column instead of using a LONGTEXT column in this application. You should probably research the impart if could have on you before switching though. Some of the older bug reports appear to mention Wordpress users struggling with this when reading from the Wordpress database for plugins.

Tuesday, August 18, 2009

ORDER BY RAND Alternative

If you're reading this chances are you know everything you need to know about why using ORDER BY RAND() in MySQL is not such a good idea so I'll skip all of that and get right to the alternative.

Add a new column to your database table called rand_id and make it large enough to hold a UNIX timestamp. The index is extremely important. This table modification may take awhile if you have a large table.

ALTER TABLE my_table ADD `rand_id` INT NOT NULL ,
ADD INDEX(rand_id)


With the new column in place, setup a cron job which executes an SQL statement to update that column once a day (or more often if you wish). What this does is give each row a random ordering ID starting with the current time plus a random portion of the seconds in a day.

UPDATE my_table
SET rand_id = (UNIX_TIMESTAMP() + (RAND() * 86400))


To fetch a random row you use a query like this.

SELECT *
FROM my_table
WHERE rand_id > UNIX_TIMESTAMP()
LIMIT 1


Here's a few numbers from a test table with 1,572,851 rows. The server used isn't exactly optimized to be a database server either.

Selecting a random row using ORDER BY RAND() takes 37 seconds.
Executing the query to update the rand_id column takes 110 seconds.
Selecting a random row after the rand_id column is updated takes 0.0012 seconds.

Now if we needed a random row every second of every day, we would need 86,400 rows.

86,400 * 37 = 3,196,800 seconds spent selecting random rows. It would take more than a month to select a days worth or rows using ORDER BY RAND().

86,400 * 0.0012 = 103.68 seconds spent fetching random rows. If we add the 110 seconds needed to generate the rand_id column values that brings us to 213.68 seconds. That's less than 5 minutes spent fetching a days worth of random rows. Huge improvement over the ORDER BY RAND option.

Now, this solution can have a few serious drawbacks depending on your needs and table size.

The longer it has been since the rand_id column was generated, the more predictable the random row will be since you can make a note of the rows that have been selected throughout the day and narrow things down considerably.

Tables with less than roughly 90,000 rows are in danger of producing the same result multiple times in a row for seconds to minutes at a time.

Friday, September 5, 2008

Web File Formats 101

These are file extensions or formats you're likely to come across at one point or another if you're going to get into designing web sites. They're split into common groups, include a brief description of where you're likely to see or use them, and clicking the *.EXTENSION link in the titles will take you to a Wikipedia entry with a lot more details on that file format.

This list is a collection of about 45 file formats I've used enough times to remember when presented with a huge list of file formats after roughly five years of experience.




Browser File Formats

Browser file formats are files that instruct web browsers to do something directly. Many technologies may work with browser file formats, but browsers are generally the things that make use of them.

*.AS Adobe/Macromedia Action script file

AS files are generally used as library or optional include files in Flash or Flex applications.

*.CSS Cascading Style Sheet file

CSS files contain CSS rules which web browsers use to decide what web pages should look like.

*.HTM / *.HTML Hyper Text Markup Language

HTML files are web pages. HTML is what files such as PHP, ASP, CFML, and JSP produce when they're executed.

*.ICO Icon file

ICO files are commonly used as Favicons (that little icon in the address bar)

*.JS Javascript file

JS files are used to store Javascript. Javascript is a cornerstone in many of the applications on the web today. Javascript provides a way for web developers to utilize the browsers rendering engine and make changes to the content of a web page without needing to make a trip back to the web server to generate a new page with the updated content. Javascript is becoming more popular for producing animation in web pages since the jQuery, Prototype, and Moo-Tools Javascript libraries were developed.

*.SWF Adobe/Macromedia Shockwave Flash file

SWF files are generally used in Rich Internet Applications and require the Adobe/Macromedia Flash Player to view.

*.VBS Microsoft Visual Basic Script file

VBS files are similar in concept to JS files but they often have control over operation system functionality normally off limits to JS, however VBS files are only supported by Microsoft Internet Explorer versus JS which is available on almost every popular web browser. VBS is generally only found on private Microsoft Windows based Intranet sites, though even there it's being replaced by newer offerings from Microsoft.

Image File Formats

Image file formats are files used to store and transfer photographs, vector art, logos, favicons and things like that. They're also used to store information artists use to modify the images in image manipulation applications such as Photoshop and Fireworks.

*.BMP Bitmap Graphic file

BMP files are an early image format with little or no compression applied. BMP files are sometimes found on websites, but have generally been replaced by formats such as PNG, GIF, and JPG in recent years.

*.GIF Graphic Interchange Format file

GIF files are image files that sometimes include simple frame-based animation and alpha transparency. PNG has gained popularity over GIF in recent years do the the superior color quality of PNG, however transparency problems with Microsofts Internet Explorer 6 have slowed the transition from GIF to PNG.

*.JPG / *.JPEG Joint Photographic Group image file

JPG files are image files designed specifically for photos. The compression method used has been optimized for working with complicated images, making it especially well suited for transferring photos on the web.

*.PNG Portable Network Graphic file

PNG is an image file format that has gained popularity over GIF in recent years do the the superior color quality of PNG, however transparency problems with Microsofts Internet Explorer 6 have slowed the transition from GIF to PNG. PNG is also the native file format used by Adobe/Macromedia Fireworks, similar to how PSD is the native file format used by Adobe Photoshop. PNG has the ability to store information about layers when used in Fireworks in much the same way that PSD can store information about layers in Photoshop.

*.PSD Photoshop Document file

PSD is an image file format and is the native format used by Adobe Photoshop. PSD allows image authors to store information about layers in an image when edited in Photoshop in much the same way that PNG enables layer information when edited in Fireworks. Unlike PNG which generally has multiple export modes and can be exported for use on the web, PSD is primarily used for raw or drafts before exporting the finished pieces of the image to other image formats such as PNG, GIF, or JPG.

*.SVG Scalable Vector Graphics file

SVG files are vector image files. Unlike common raster image formats that store information about each pixel in the image, vector image formats only store information about how to draw the image, which generally consists of coordinates to place simple shapes and how to fill them with color.

Audio/Video File Formats

Audio/Video file formats are used to store and transfer audio and video.

*.AVI Audio Video Interleave file

AVI files are commonly used to distribute video.

*.FLV Flash Video file

FLV files are usually not observed directly, but rather played through an SWF file. FLV files are used for distributing streaming video.

*.MOV / *.QT Apple QuickTime Video Clip file

MOV or QT files are used to distribute video, similar to AVI files. Quicktime is generally known for a great level of quality in video.

*.MP3 MPEG Audio Stream, Layer III file

MP3 files are used to distribute audio. MP3 files use a compressed version of the audio information and are generally many times smaller in size compared to WAV files, despite including extra information about the audio such as the author, title, album name, and description.

*.MPEG / *.MPG MPEG 1 System Stream file

MPEG or MPG files are used to distribute video, similar to AVI, MOV, and QT files.

*.WAV Waveform Audio file

WAV files are an early audio format file which has all but been replaced by MP3 on the web due to superior compression and information embedding provided by the MP3 audio file format.

*.WMA Microsoft Windows Media Audio file

WMA is an audio file similar to MP3 and WAV developed by Microsoft. Like WAV, WMA has been replaced to a large extend on the web by MP3.

Compression File Formats

Compression file formats are used to apply compression to single, or groups of files in order to reduce the size needed to store and transfer the files, as well as making large groups of files easier to transfer at once.

*.BZ2 BZIP2 compressed archive file

BZ2 files are generally used by programmers to distribute compressed source code packages.

*.GZ GZIP compressed archive

GZ files are sometimes used in the same way BZ2 files are, for programmers to distribute source code packages. Some systems use GZIP compression on their old log files to save space which results in file names such as "date.log.gz". Some web servers use GZIP compression to compress data before sending it from a server to a browser.

*.RAR winRAR compressed archive

RAR files are often used by individuals to compress and transfer large files, or large groups of files. Similar to GZ, BZ2, and ZIP files.

*.ZIP Compressed archive file

ZIP files are similar to RAR, BZ2, and GZ files however ZIP files are generally a lot more common. ZIP files package and to a point compress other files to be transferred from system to system.

Misc File Formats

Misc file formats are generally not designed specifically for web sites, but have found numerous uses on them.

*.*.BAK Backup copy of file

Backup. The second asterisk is generally the original file extension.Rather than trying to open this file, you will usually rename it and remove the .BAK from the end, then open it with the appropriate application

*.DOC Microsoft Word document

DOC files are often used by companies to distribute bundled information suitable for printing. DOC files are a favorite of companies for sending contracts via e-mail. DOC files are similar in concept to PDF files.

*.INI Information/configuration file

INI files are commonly used to store persistent configuration data for web servers and web browsers. INI files are also used in the same way for applications that have nothing to do with web design. Using XML files to store configuration data instead of INI files has been gaining popularity in recent years.

*.KML Google Earth/Keyhole place mark file

KML files are used to share specific locations, or placeholders in the popular Google Earth application. KML files are becoming increasingly popular on business web sites who want to provide directions for customers.

*.PDF Acrobat Portable Document Format file

PDF files are often used by companies to distribute bundled information suitable for printing, similar to DOC files. Some print shops will ask for PDF files when you need to have fliers or brochures printed out. Some companies will make PDF copies of important information found on their web site available for people do download and read offline, or print out and read away from their desk.

*.RDF Resource Description Framework file

RDF files are often used to deliver RSS content. RDF RSS feeds are generally more specialized in comparison to the generic XML versions of RSS files.

*.TXT Plain text file

TXT files are plain text files. TXT files generally display the characters as-is with no consideration for special style codes.

*.XML Extensible Markup Language file

XML files are generally used to store structured information. Many web sites use AJAX to load XML in the background and update small portions of web pages without needing the server to generate the entire web page over again with slightly changed information. XML files are also gaining popularity as a settings or preference file format. The generic yet descriptive format used to section data off makes XML the perfect file format for storing raw information that will be displayed in many different ways.

Server File Formats

Server file formats are generally scripts that web servers execute in order to produce web pages dynamically. these types of files are also often used to perform maintenance on the server through automation.

*.ASP Microsoft Active Server Pages

ASP files are an earlier version of ASPX files. ASP files are similar in concept to CFML, PHP, ASPX, RB, and JSP files.

*.ASPX Microsoft ASP.NET source code file

ASPX files are a more recent version of ASP files. ASPX files are used on web servers to prepare HTML documents to send to a web browser. ASPX files are similar in concept to ASP, CFML, PHP, RB, and JSP files.

*.CFM / *.CFML Coldfusion Markup Language file

CFML files are used on web servers to prepare HTML documents to send to a web browser. CFML files are similar in concept to ASPX, PHP, ASP, RB, and JSP files.

*.CGI Common Gateway Interface file

CGI files are generally used to provide a bridge, or gateway, between a web server and applications installed on the server such as Python, Perl, sed, and many others.

*.JSP Java Server Pages

JSP files are used on web servers to prepare HTML documents to send to a web browser. JSP files are similar in concept to ASPX, PHP, ASP, RB, and CFML files.

*.LOG Log file

LOG files are generally exactly what the name implies, a log. Web servers use them to store details about each visit to a web site as well as to store information useful to administrators and developers in the event of a problem. Old, or archived LOG files are often compressed using GZIP compression, which results in "filename.log.gz" types of file names.

*.MDB Microsoft Access Database file

MDB files are a Microsoft database data storage format. The MySQL database engine has been gaining popularity over MDB in recent years. There are many MDB to MySQL converters and conversion methods available on-line.

I posted a how to for Importing MDB to MySQL on a Debian server not too long ago.

*.PHP Hypertext PreProcessor script file

PHP files are used on web servers to prepare HTML documents to send to a web browser. PHP files are similar in concept to ASPX, JSP, RB, ASP, and CFML files. PHP is one of the most popular pre-processors used on the web today.

*.PL Perl script file

PL files are Perl script files. Perl is one of the first scripting languages to be used on the web. Perl is often used in conjunction with CGI. In recent years PHP has stolen much of Perls thunder, leaving Perl used primarily by veterans.

*.RB Ruby script / Ruby on Rails class file

RB files are used on web servers to prepare HTML documents to send to a web browser. RB files are similar in concept to ASPX, JSP, PHP, and CFML files.

*.SHTML / *.SSI HTML including Server Side Includes

SHTML files are used on web servers to tell the web server that there's lines in the file that need to be replaced with the contents of other files before sending the contents of the file to the web browser. SHTML is generally a very simple version of what file types such as ASP, ASPX, CFML, JSP, PHP, and RB are designed for.

*.SQL Structured Query Language data file

SQL files generally contain the description of a database and the database table layouts. In some cases SQL files will contain information dumps from database tables, though this is generally reserved for small databases.

.htaccess Apache dynamic configuration file

.htaccess files are commonly used to make runtime adjustments to an Apache web server configuration without needing to restart the Apache server. Not all configuration options are able to be altered via .htaccess and depending on the servers configuration at start up there might not be any options that can be configured via .htaccess

Saturday, March 22, 2008

MySQL ORDER BY multiple columns

Given this MySQL query where the single letter column names relate to the format argument of the PHP date function,

SELECT Y,n,j,G
FROM my_table
ORDER BY Y,n,j,G DESC
LIMIT 1


One might assume the most recent row of the following two would be returned.
In case you don't know these off the top of your head the order is,
hour, day, month, year

INSERT INTO my_table (G, j, n, Y) VALUES 
(23, 27, 10, 2007),
(0, 28, 10, 2007)


However that's not the case & the earlier of the two is actually returned.

Reason being that when a MySQL result is being ordered by multiple columns, the ASC or DESC keyword needs to be specified after each of the column names.

So the following query would return the later of the two rows.

SELECT Y,n,j,G
FROM my_table
ORDER BY Y DESC,n DESC,j DESC,G DESC
LIMIT 1

Wednesday, December 26, 2007

Database Storage of Apache Virtual Hosts

I came up with a workaround for storing Apache Virtual Hosts in a database. I figure it could be adapted to work on Windows using batch files, but I'm only going to go over doing it on a UNIX-like system here. Specificly Ubuntu, or Debian.

At least basic knowledge of the following is assumed.


The first thing that needs to be done is creating a database schema for storing virtual host containers. I'll stick with a simple schema for now so focus stays on Apache to Database connection.

This schema is for MySQL.
CREATE TABLE `vhosts` (
`active` BOOL NOT NULL DEFAULT '0',
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`container` TEXT CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ,
INDEX ( `active` )
);


`container` will store the actual vhost container. For instance somthing like this.
NameVirtualHost *
<VirtualHost *>
ServerAdmin webmaster@domain.com
DocumentRoot /home/me/public_html/
</VirtualHost>


`id` is for relation to other tables. Pretty standard stuff.

`active` is an on/off switch for each virtual host. Apache will only include a virtual host from the database if it is active.


With a database table ready to be read from we can move to the part where Apache uses the data from that table.
In httpd.conf, apache#.conf, or whatever your system uses for Apaches' main configuration file there's a directive available called Include. If you edit your configuration file and look at the end of the file you may see an example of Include being used already.
If you're on a Debian system you're likely going to find somthing similar to the following.
# Include the virtual host configurations:
Include /etc/apache2/sites-enabled/


If you're not on a Debian system, you're going to need to create a directory in your Apache directory similar to that. It should be owned by root. You will also need to add an Include directive like the one shown above to your main conf file pointing to the directory you create so Apache knows to load any configuration files found in that directory.

If you are on a Debian system you're eventually going to be emptying that directory so there's no conflicts with the database method. So now would probably be a good time to import the sites from sites-available into the database & mark the `active` field according to which sites are also in `sites-enabled`, as well as backing up that directory.


Now that we have a database to store and retrieve virtual host containers from, & a place where Apache can load configuration files from it's time to write a command to build a configuration file full of virtual host containers from the database for Apache to load.

Here's an example of such a command. Note that you'll have to sort out permissions on your own, the --user & --pass are used here for example sake, though MySQL documentation warns against doing it that way because it leaves the password visible in the pid file for a short time.

Some options are on a new line for ease of reading. The command ends with output redirection to a file in the sites-enabled directory for Apache to load.
I'm using a UNIX domain socket here, but you could use a TCP connection if need be. You might also want to alter the file output redirection is beind aimed at.
mysql --user=root --pass=pass --batch --raw --silent
--database=Websites
--execute="SELECT container FROM vhosts WHERE active = 1"
--skip-column-names
--protocol=socket
--socket=/var/run/mysqld/mysqld.sock
>/etc/apache2/sites-enabled/vhosts.conf


Of course instead of using that directory as an include directory in Apaches' configuration file, you could comment out that particular include & point an include directly at the file you have the above command outputting to.

Now we have a method to connect Apache with a Database for obtaining virtual hosts, but there's still manual labor involved in needing to run that mysql command when Apache is [re]started or reloaded.

To help with this we will have to edit the shell script used to stop/start/restart Apache so that every time Apache is [re]started, or reloaded, this file is refreshed.
On Debian this script is /etc/init.d/apache2, since I'm using Apache2.

Basicly anywhere there is a log_daemon_msg that [re]starts Apache in this script, we add our own logging & our database command right before it. So we end up with some sections looking somthing like this.
log_daemon_msg "Extracting vhosts from database" "apache2"
mysql --user=root --pass=pass --batch --raw --silent --database=Websites --execute="SELECT container FROM vhosts WHERE active = 1" --skip-column-names --protocol=socket --socket=/var/run/mysqld/mysqld.sock >/etc/apache2/vhosts.conf
log_daemon_msg "Starting web server" "apache2"


Now whenever Apache is [re]started, or reloaded, it will be using the vhost containers for active sites stored in the database.
sudo /etc/init.d/apache2 restart

Saturday, December 15, 2007

MySQL & Microsoft Access Databases (mdb)

Update 12/17: There's a Perl script controller here that will export as an SQL file. It handles converting between each databases datatypes.

After a quick search for a way to import an MDB into MySQL didn't return anything I liked. I remembered I have a Ubuntu computer now & started looking for the pieces instead.

What I ended up doing boils down to this.
  1. Install mdbtools
  2. Determine table names
  3. Export table to CSV
  4. Determine column names & required data types
  5. Create tables in MySQL
  6. LOAD DATA INFILE
  7. PROCEDURE ANALYSE()
  8. Tweak data types


The first thing I found was a package called mdbtools. The package includes some MDB command line utilities. So far this has been usefull for extracting database tables from MDB files in CSV format.

As usual the package was easy to install from the command line.
sudo apt-get install mdbtools

I needed to use two of the utilities to extract my tables. The first one is mdb-tables which lists the names of the tables in the MDB.
me@box:/mdb$ mdb-tables database.mdb
table_one table_two


Now that I knew the exact table names I was able to use the second utility, mdb-export, to export the tables.
me@box:/mdb$ mdb-export
Usage: mdb-export [options] <file> <table>
where options are:
-H supress header row
-Q don't wrap text-like fields in quotes
-d <delimiter> specify a column delimiter
-R <delimiter> specify a row delimiter
-I INSERT statements (instead of CSV)
-D <format> set the date format (see strftime(3) for details)
-S Sanitize names (replace spaces etc. with underscore)
-Q <char> Use <char> to wrap text-like fields. Default is ".
-X <char> Use <char> to escape quoted characters within a field. Default is doubling.
me@box:/mdb$ mdb-export -X \ -H database.mdb table_one > table_one.csv


Now it's worth noteing the first time I didn't think to tell the output where to go and I ended up with hundreds of table rows scrolling across my screen until I pressed CTRL+C to hault it.

From there I opened the CSV file in Notepad2 and found out the column names by looking at the first line in the file.
I created a new database table with the largest possible data types in MySQL using phpMyAdmin.

Once I had my database table created, I used a LOAD DATA INFILE to import the CSV file into the table.
LOAD DATA
INFILE '/mdb/table_one.csv'
INTO TABLE database.table_one
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
IGNORE 1 LINES


After a few seconds I had a table full of data. I used phpMyAdmins' Propose table structure, which runs the following query.
SELECT * 
FROM `table_one`
PROCEDURE ANALYSE()


Using the info provided by that, and a little planning for what will happen with the database later, I was able to tweak the table a little so that it doesn't need to use the largest possible data types for columns. I just wanted to make sure I didn't truncate anything during the import.

It might be worth rigging this to work with the phpMyAdmin import controll panel, but I'll save that for another day.