Oracle redo logs and dd

I recently had a case with a customer concerned over the time taken to dd an Oracle redo log file even though Oracle had been shutdown. Finding this strange I thought I’d use TNF probes to see what was happening.

#!/bin/sh -x
# Script to collect TNF data whilst dd is happening on Oracle log file
#
# Andy H 19/09/06
#
TMPDIR=${TMPDIR:-/var/tmp/tnf}
output_file=$TMPDIR/ktrace.tnf
command="dd if=/u01/oracle/oradata/TEST/redo01.log of=/u01/oracle/oradata/TEST/tmp.tmp"
start_prex()
{
echo "Enabling kernel tracing"
prex -k < $TMPDIR/pid.out
wait $pid
fi
}
# Extract kernel trace buffer
ktrace_xtract() {
echo "Extracting kernel trace buffer"
tnfxtract $output_file || fail "Could not extract kernel trace buffer"
ls -l $output_file
}
# Get kernel symbols
ktrace_sym() {
echo "Collecting kernel symbol information"
/usr/ccs/bin/nm -fx /dev/ksyms | tr "|" " " > $TMPDIR/nm.out
}
# tar and gzip data
ktrace_gzip() {
echo "Packaging collected data"
/usr/sbin/tar cf /tmp/tnf.tar $TMPDIR
/usr/bin/gzip /tmp/tnf.tar
echo "Collection complete, please send in /tmp/tnf.tar.gz for analysis"
}
start_prex
run_command
ktrace_xtract
ktrace_sym
stop_prex
ktrace_gzip

Following using some clever sorting awk script from Tim Uglow I compared the TNF output from the normal dd timing and then the slow one.

64.922000  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157253632
64.924500  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157261824
64.927100  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157270016
64.929200  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157278208
64.931500  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157286400
101.904300  -  2562 1 0x300042edc20 0 syscall_end rval1: 4 rval2: 0 errno: 0 creat64 98.822400 No error
101.908200  -  2562 1 0x300042edc20 0 thread_queue tid: 0x300042edc20 cpuid: 0 priority: 0 queue_length: 0x0
101.908500  -  2562 1 0x300042edc20 0 thread_state tid: 0x300042edc20 state: 8 waiting for CPU
101.924000  -  2562 1 0x300042edc20 0 syscall_start sysnum: 137 sysconfig
101.924500  -  2562 1 0x300042edc20 0 syscall_end rval1: 8192 rval2: 18466816 errno: 0 sysconfig 0.000500 No error
101.928700  -  2562 1 0x300042edc20 0 syscall_start sysnum: 17 brk
101.931200  -  2562 1 0x300042edc20 0 syscall_end rval1: 0 rval2: 134217728 errno: 0 brk 0.002500 No error
101.932500  -  2562 1 0x300042edc20 0 syscall_start sysnum: 17 brk
101.941300  -  2562 1 0x300042edc20 0 syscall_end rval1: 0 rval2: 134217728 errno: 0 brk 0.008800 No error
101.943200  -  2562 1 0x300042edc20 0 thread_state state: 4 asleep in user data fault
101.945400  -  2562 1 0x300042edc20 0 address_fault address: 0x28000 fault_type: 0 access: 1
101.947900  -  2562 1 0x300042edc20 0 anon_zero address: 0x28000
101.967700  -  2562 1 0x300042edc20 0 thread_state state: 0 in userland
101.978500  -  2562 1 0x300042edc20 0 syscall_start sysnum: 98 sigaction
101.980600  -  2562 1 0x300042edc20 0 syscall_end rval1: 0 rval2: 0 errno: 0 sigaction 0.002100 No error
101.985800  -  2562 1 0x300042edc20 0 syscall_start sysnum: 3 read
102.000000  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.014200 No error
102.009800  -  2562 1 0x300042edc20 0 syscall_start sysnum: 4 write
102.073000  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8962 errno: 0 write 0.063200 No error
102.074500  -  2562 1 0x300042edc20 0 syscall_start sysnum: 3 read
102.079700  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.005200 No error
102.082300  -  2562 1 0x300042edc20 0 syscall_start sysnum: 4 write
102.088200  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8962 errno: 0 write 0.005900 No error
102.089200  -  2562 1 0x300042edc20 0 syscall_start sysnum: 3 read
102.093600  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.004400 No error
102.096200  -  2562 1 0x300042edc20 0 syscall_start sysnum: 4 write
102.129600  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8962 errno: 0 write 0.033400 No error
102.130900  -  2562 1 0x300042edc20 0 syscall_start sysnum: 3 read
102.136100  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.005200 No error
102.138600  -  2562 1 0x300042edc20 0 syscall_start sysnum: 4 write

So each of the read() / writes() happen one after each other…..but what about on the so called broken one:

59.433900  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157253632
59.436200  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157261824
59.438300  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157270016
59.440300  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157278208
59.442600  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157286400
95.378500  -  2635 1 0x300041792c0 3 syscall_end rval1: 4 rval2: 0 errno: 0 creat64 91.600000 No error
95.380600  -  2635 1 0x300041792c0 3 thread_queue tid: 0x300041792c0 cpuid: 3 priority: 0 queue_length: 0x0
95.380800  -  2635 1 0x300041792c0 3 thread_state tid: 0x300041792c0 state: 8 waiting for CPU
95.398100  -  2635 1 0x300041792c0 3 syscall_start sysnum: 137 sysconfig
95.398600  -  2635 1 0x300041792c0 3 syscall_end rval1: 8192 rval2: 18466816 errno: 0 sysconfig 0.000500 No error
95.404300  -  2635 1 0x300041792c0 3 syscall_start sysnum: 17 brk
95.406100  -  2635 1 0x300041792c0 3 syscall_end rval1: 0 rval2: 134217728 errno: 0 brk 0.001800 No error
95.407300  -  2635 1 0x300041792c0 3 syscall_start sysnum: 17 brk
95.415700  -  2635 1 0x300041792c0 3 syscall_end rval1: 0 rval2: 134217728 errno: 0 brk 0.008400 No error
95.417700  -  2635 1 0x300041792c0 3 thread_state state: 4 asleep in user data fault
95.419500  -  2635 1 0x300041792c0 3 address_fault address: 0x28000 fault_type: 0 access: 1
95.421600  -  2635 1 0x300041792c0 3 anon_zero address: 0x28000
95.440600  -  2635 1 0x300041792c0 3 thread_state state: 0 in userland
95.453200  -  2635 1 0x300041792c0 3 syscall_start sysnum: 98 sigaction
95.454800  -  2635 1 0x300041792c0 3 syscall_end rval1: 0 rval2: 0 errno: 0 sigaction 0.001600 No error
95.460900  -  2635 1 0x300041792c0 3 syscall_start sysnum: 3 read
95.483700  -  2635 1 0x300041792c0 3 strategy device: 118,22  block: 2262752 size: 0x200 buf: 0x60008105278 flags: 0x2200061
95.522200  -  2635 1 0x300041792c0 3 thread_block reason: 0x60008105338 stack: 
95.523200  -  2635 1 0x300041792c0 3 thread_state state: 7 asleep for other reason
95.523400  -  2635 1 0x300041792c0 3 thread_state tid: 0x2a10001fcc0 state: 1 in syscall/pagefault
95.939700  -  2635 1 0x300041792c0 17 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.478800 No error
95.972400  -  2635 1 0x300041792c0 17 syscall_start sysnum: 4 write
96.070000  -  2635 1 0x300041792c0 17 syscall_end rval1: 512 rval2: 8962 errno: 0 write 0.097600 No error
96.072000  -  2635 1 0x300041792c0 17 syscall_start sysnum: 3 read
96.078800  -  2635 1 0x300041792c0 17 strategy device: 118,22  block: 2262753 size: 0x200 buf: 0x60008105278 flags: 0x2200061
96.105100  -  2635 1 0x300041792c0 17 thread_block reason: 0x60008105338 stack: 
96.106100  -  2635 1 0x300041792c0 17 thread_state state: 7 asleep for other reason
96.106300  -  2635 1 0x300041792c0 17 thread_state tid: 0x2a1005d1cc0 state: 1 in syscall/pagefault

Interesting we appear to be blocking for a reason….ahhh following from a quick chat with John Alderson and reading his blog entry this now becomes clear.

Basically directio is still persistant on the file after Oracle is shutdown. DIRECTIO_ON allows you to force directio per-file and is used potentially by databases if their app tunable is so set (which in our case it is). The flag is set in the inode’s i_flag and hangs around cached within the inode in the DNLC and will effect files until the inode is flushed from the DNLC.