Sunday, August 26, 2007

cross-check results using triggers and views

I've been struggling to come up with a decent prototype for the self-verifying random testcases I wrote about earlier. In the mean time, I wrote a set of triggers and a view which should yield the same information, no matter what you do to the underlying table. If they don't, there's a bug in the server.

The test table, t1 is simple. It has an int and a varchar column. There are triggers after insert/update/delete. The triggers update an evolving summary table.

The evolving summary table maintains the average+sum value of the int, and the average+sum length of the varchar field. The view selects these values from the table too.

To test it, I ran 50 threads of random insert/update/delete queries to table t1 for a few minutes. Then after the test had stopped I confirmed the results matched. (i.e. the running totals had been correctly maintained - if they match the view).

Here's the setup of table t1, the running summary table, and the view

Next, I hit the server with random update/insert/delete for a few minutes:


- completed spawning new database worker threads
- 15 threads running, 0003961 successful queries. (330.083333 QPS).
- 15 threads running, 0005334 successful queries. (114.416667 QPS).
- 15 threads running, 0006206 successful queries. (72.666667 QPS).
- 15 threads running, 0006783 successful queries. (48.083333 QPS).
- 15 threads running, 0007779 successful queries. (83.000000 QPS).
- 15 threads running, 0008981 successful queries. (100.166667 QPS).
- 15 threads running, 0010393 successful queries. (117.666667 QPS).
- 15 threads running, 0010979 successful queries. (48.833333 QPS).
- 15 threads running, 0011373 successful queries. (32.833333 QPS).
- 15 threads running, 0011875 successful queries. (41.833333 QPS).
- 15 threads running, 0012399 successful queries. (43.666667 QPS).
- 15 threads running, 0012870 successful queries. (39.250000 QPS).
- 15 threads running, 0013566 successful queries. (58.000000 QPS).
- 15 threads running, 0013775 successful queries. (17.416667 QPS).
- 15 threads running, 0014001 successful queries. (18.833333 QPS).
- 15 threads running, 0014265 successful queries. (22.000000 QPS).
- 15 threads running, 0014869 successful queries. (50.333333 QPS).
- 15 threads running, 0015325 successful queries. (38.000000 QPS).
- 15 threads running, 0016425 successful queries. (91.666667 QPS).
- 15 threads running, 0016743 successful queries. (26.500000 QPS).
- 15 threads running, 0016919 successful queries. (14.666667 QPS).
- 15 threads running, 0017180 successful queries. (21.750000 QPS).
- 15 threads running, 0017448 successful queries. (22.333333 QPS).
- 15 threads running, 0017745 successful queries. (24.750000 QPS).
- 15 threads running, 0017997 successful queries. (21.000000 QPS).
- 15 threads running, 0018095 successful queries. (8.166667 QPS).
- waiting for threads to finish



And - let's see if the results match up.


mysql> select * from t1_summary;
+--------------+-------------+-----------------+--------------------+-----------------+
|num_elements | sum_id | sum_length_name | avg_id | avg_length_name |
| 10237 | 76231668630 | 1198066 | 7446680.5343377098 | 117.0329195294 |
+--------------+-------------+-----------------+--------------------+-----------------+
1 row in set (0.00 sec)

mysql> select * from t1_view;
+-------------+-----------------+--------------+-----------------+
|sum_id | sum_length_name | avg_id | avg_length_name |
+-------------+-----------------+--------------+-----------------+
|76231668630 | 1198066 | 7446680.5343 | 117.0329 |
+-------------+-----------------+--------------+-----------------+
1 row in set (0.05 sec)



Yup, they do :) What have I actually tested here ?? Hm. At least one thing. The trigger's execution didn't miss anything. Why is this first step important ?? Because my program doesn't know the correct answer to the above 2 queries. It sends random garbage to server, hence it doesn't know how to verify a result. Having the server keep a running total, and comparing that to a final total from the view, is some very small assurance I think.

Next step will be to use more detailed functions instead of SUM and AVG. Something whereby the order of execution will matter... More on this later.

Saturday, August 25, 2007

how to undelete rows from a fixed length myisam table

You have inadvertently deleted some rows from your table and want them back. This is semi-doable with fixed row format of MyISAM. I put together a few steps, which I'll compliment with a program that does it for you, later.

  • save output of SHOW TABLE STATUS LIKE 't1' and SHOW CREATE TABLE `t1`.
  • shutdown mysql server asap!
  • backup t1.frm, t1.MYI, t1.MYD immediately
  • create a new table t1_recover, which has no auto-inc and no unique/pk.
  • remove the files t1_recover.MYD and t1_recover.MYI
  • write a C program that scans through t1.MYD reading blocks of length Avg_row_length and checking if first byte indicates a row is marked as deleted.
  • if it's deleted, dump it as is into the file t1_recover.MYD
  • goto mysql prompt. issue REPAIR TABLE `t1_recover` USE_FRM
Some notes.

You cannot recover entire record. You'll lose first 6 bytes of each record (or how every long data pointer length is). Typically, this is the first 2 int columns trashed.

You cannot determine which columns contained NULL in the recovered records (assuming some columns are nullable).

If you anticipate to have to use this recovery method often, then you can create a fake char(9) column at the beginning of the table, so that when a row is deleted, this column will be overridden instead of your real data..

Here is the code! myisam undeleter

Example usage below:
1) create a table and delete some records:

mysql> create table mydel(id int, name char(15), address char(15), bday datetime, daysleft decimal(10,5));
Query OK, 0 rows affected (0.06 sec)

mysql> insert into mydel values (1,'shane','down south','2005-09-01',10.5);
Query OK, 1 row affected (0.01 sec)

mysql> insert into mydel values (2,'susan car','up north','2001-02-01',14.1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into mydel values (3,'bobby scott','chinatown','1976-12-21',55.5);
Query OK, 1 row affected (0.00 sec)

mysql> delete from mydel where id=2 or id=3;
Query OK, 2 rows affected (0.00 sec)

mysql> flush tables;
Query OK, 0 rows affected (0.00 sec)

mysql> show table status;
+--------+--------+------------+------+----------------+-------------+
| Name | Engine | Row_format | Rows | Avg_row_length | Data_length |
+--------+--------+------------+------+----------------+-------------+
| mydel | MyISAM | Fixed | 1 | 49 | 147 |
+--------+--------+------------+------+----------------+-------------+

2) next, build and run the app:


> gcc -o undeleter -lm ./undeleter.c
>
> ./undeleter /5.1/data/test/mydel.MYD /5.1/data/test/mydel_recover.MYD 49
allocating a block of memory for records, 49 bytes
starting to read records
found deleted record at offset 49
found deleted record at offset 98
-----------------
completed
-----------------
records copied into '/5.1/data/test/mydel_recover.MYD': 2
records present in '/5.1/data/test/mydel.MYD': 1

next steps:
1. remove the MYI for '/5.1/data/test/mydel_recover.MYD'
2. make sure '/5.1/data/test/mydel_recover.MYD' definition .frm is created without any unique or PK keys
3. repair table use_frm


3) next, sort out the recovered table


sbester@www:/5.1/data/test> cp ./mydel.frm ./mydel_recover.frm

mysql> repair table mydel_recover use_frm;
+--------------------+--------+----------+------------------------------------+
| Table | Op | Msg_type | Msg_text |
+--------------------+--------+----------+------------------------------------+
| test.mydel_recover | repair | warning | Number of rows changed from 0 to 2 |
| test.mydel_recover | repair | status | OK |
+--------------------+--------+----------+------------------------------------+
2 rows in set (0.00 sec)

mysql> select * from mydel_recover;
+------+-------------+-----------+---------------------+----------+
| id | name | address | bday | daysleft |
+------+-------------+-----------+---------------------+----------+
| -1 | san car | up north | 2001-02-01 00:00:00 | 14.10000 |
| 0 | ☺bby scott | chinatown | 1976-12-21 00:00:00 | 55.50000 |
+------+-------------+-----------+---------------------+----------+
2 rows in set (0.00 sec)


Notice! I just cp the .frm. That's only because I didn't need to strip out the
PK or auto-increment column. As you seen, `id` is trashed. `name`
is partially trash. the `id` would have caused alot of duplicate key
errors if you still had a PK.

More information here: MySQL_Internals_MyISAM

Tuesday, August 21, 2007

new direction for test development

I've been thinking about this today. My attempts thus far to write random queries which stress various parts of the server have worked, and served a purpose - but have been rather huge and clumsy.

For example a 5 union select, each comprising of 7 joins did have it's value.. The problem with huge random queries is simply that the results cannot be verified as easily as they were constructed (unless you run the same on various DBMS). These huge tests are mostly only good for checking if the server crashes.

Now, I wish to try write testcases that have these properties:
  • small
  • results are self-verified
  • each test is random, but reproducible given a seed value
Tonight I'll throw together a code-generator to do simple insert/select and post any interesting results here later.

Monday, August 13, 2007

a moment in time

I pulled the latest 5.1BK sources today and built mysqld. I must admit, it fared pretty well against my little charset tester. 0 failures and over 120 tests done which took >5 hours.

To give a clue to the score last time I run the same tests, here are the bugs I opened!

Sunday, August 12, 2007

sad news

my girlfriend hates blogs :(