obspy.io.segy - SEG Y and SU read and write support for ObsPy

The obspy.io.segy package contains methods in order to read and write files in the SEG Y (rev. 1) and SU (Seismic Unix) format.

copyright:The ObsPy Development Team (devs@obspy.org)
license:GNU Lesser General Public License, Version 3 (https://www.gnu.org/copyleft/lesser.html)

Note

The module can currently read files that are in accordance to the SEG Y rev. 1 specification but has been designed to be able to handle custom headers as well. This functionality is not yet exposed because the developers have no files to test it with. If you have access to some files with custom headers please consider sending them to devs@obspy.org.

Reading

The SEG Y and Seismic Unix (SU) file formats are quite different from the file formats usually used in observatories (GSE2, MiniSEED, ...). The Stream/Trace structures of ObsPy are therefore not fully suited to handle them. Nonetheless they work well enough if some potential problems are kept in mind.

SEG Y files can be read in four different ways that have different advantages/disadvantages. Most of the following also applies to SU files with some changes (keep in mind that SU files have no file wide headers).

  1. Using the standard read() function.
  2. Using the obspy.io.segy specific obspy.io.segy.core._read_segy() function.
  3. Using the internal obspy.io.segy.segy._read_segy() function.
  4. Some SEG-Y files are too large to be read into memory. The obspy.io.segy.segy.iread_segy() function reads a large file trace by trace circumventing this problem.

Reading using methods 1 and 2

The first two methods will return a Stream object and they are identical except that the file wide SEGY headers are only accessible if method 2 is used. These headers are stored in Stream.stats.

The obvious advantage of these methods is that the returned Stream object interfaces very well with other functionality provided by ObsPy (file format conversion, filtering, ...).

Due to the fact that a single SEG Y file can contain several tens of thousands of traces and each trace will be a Trace instance which in turn will contain other objects these methods are quite slow and memory intensive.

To somewhat rectify this issue all SEG Y specific trace header attributes are only unpacked on demand by default.

>>> from obspy.io.segy.core import _read_segy
>>> from obspy.core.util import get_example_file
>>> # or 'from obspy import read' if file wide headers are of no interest
>>> filename = get_example_file("00001034.sgy_first_trace")
>>> st = _read_segy(filename)
>>> st 
<obspy.core.stream.Stream object at 0x...>
>>> print(st) 
1 Trace(s) in Stream:
Seq. No. in line:    1 | 2009-06-22T14:47:37.000000Z - 2009-06-22T14:47:41...

SEG Y files contain a large amount of additional trace header fields which are not unpacked by default. However these values can be accessed by calling the header key directly or by using the unpack_trace_headers keyword with the read()/ _read_segy() functions to unpack all header fields.

>>> st1 = _read_segy(filename)
>>> len(st1[0].stats.segy.trace_header)
8
>>> st1[0].stats.segy.trace_header.data_use # Unpacking a value on the fly.
1
>>> len(st1[0].stats.segy.trace_header) # This value will remain unpacked.
9
>>> st2 = _read_segy(filename, unpack_trace_headers=True)
>>> len(st2[0].stats.segy.trace_header)
92

Reading SEG Y files with unpack_trace_headers=True will be very slow and memory intensive for a large number of traces due to the huge number of objects created.

Reading using method 3

The internal reading method is much faster and less of a memory hog but does not return a Stream object. Instead it returns a SEGYFile object which is somewhat similar to the Stream object used in ObsPy but specific to segy.

>>> from obspy.io.segy.segy import _read_segy
>>> segy = _read_segy(filename)
>>> segy 
<obspy.io.segy.segy.SEGYFile object at 0x...>
>>> print(segy)
1 traces in the SEG Y structure.

The traces are a list of SEGYTrace objects stored in segy.traces. The trace header values are stored in trace.header as a SEGYTraceHeader object.

By default these header values will not be unpacked and thus will not show up in ipython’s tab completion. See obspy.io.segy.header.TRACE_HEADER_FORMAT (source) for a list of all available trace header attributes. They will be unpacked on the fly if they are accessed as class attributes.

By default trace data are read into memory, but this may be impractical for very large datasets. To skip loading data into memory, read SEG Y files with headonly=True. The data class attribute will not show up in ipython’s tab completion, but data are read directly from the disk when it is accessed:

>>> from obspy.io.segy.segy import _read_segy
>>> segy = _read_segy(filename, headonly=True)
>>> print(len(segy.traces[0].data))
2001

Iteratively reading a file using method 4

If a file is too big to be read into memory in one go (as all the other methods do), read the file trace by trace using the obspy.io.segy.segy.iread_segy() function:

>>> from obspy.io.segy.segy import iread_segy
>>> for trace in iread_segy(filename):
...    # Do something meaningful.
...    print(int(trace.data.sum() * 1E9))
-5

Writing

Writing ObsPy Stream objects

Writing Stream objects is done in the usual way.

>>> st.write('file.segy', format='SEGY') 

or

>>> st.write('file.su', format='SU') 

It is possible to control the data encoding, the byte order and the textual header encoding of the final file either via the file wide stats object (see sample code below) or directly via the write method. Possible values and their meaning are documented here: _write_segy()

Writing SEGYFile objects

SEGYFile objects are written using its write() method. Optional kwargs are able to enforce the data encoding and the byte order.

>>> segy.write('file.segy') 

Converting other file formats to SEG Y

SEGY files are sensitive to their headers and wrong headers might break them.

If some or all headers are missing, obspy.io.segy will attempt to autogenerate them and fill them with somehow meaningful values. It is a wise idea to manually check the headers because some other programs might use them and misinterpret the data. Most header values will be 0 nonetheless.

One possibility to get valid headers for files to be converted is to read one correct SEG Y file and use its headers.

The other possibility is to autogenerate the headers with the help of ObsPy and a potential manual review of them which is demonstrated in the following script:

from obspy import read, Trace, Stream, UTCDateTime
from obspy.core import AttribDict
from obspy.io.segy.segy import SEGYTraceHeader, SEGYBinaryFileHeader
from obspy.io.segy.core import _read_segy
import numpy as np
import sys

stream = Stream()

for _i in xrange(3):
    # Create some random data.
    data = np.random.ranf(1000)
    data = np.require(data, dtype=np.float32)
    trace = Trace(data=data)

    # Attributes in trace.stats will overwrite everything in
    # trace.stats.segy.trace_header
    trace.stats.delta = 0.01
    # SEGY does not support microsecond precision! Any microseconds will
    # be discarded.
    trace.stats.starttime = UTCDateTime(2011,11,11,11,11,11)

    # If you want to set some additional attributes in the trace header,
    # add one and only set the attributes you want to be set. Otherwise the
    # header will be created for you with default values.
    if not hasattr(trace.stats, 'segy.trace_header'):
        trace.stats.segy = {}
    trace.stats.segy.trace_header = SEGYTraceHeader()
    trace.stats.segy.trace_header.trace_sequence_number_within_line = _i + 1
    trace.stats.segy.trace_header.receiver_group_elevation = 444

    # Add trace to stream
    stream.append(trace)

# A SEGY file has file wide headers. This can be attached to the stream
# object.  If these are not set, they will be autocreated with default
# values.
stream.stats = AttribDict()
stream.stats.textual_file_header = 'Textual Header!'
stream.stats.binary_file_header = SEGYBinaryFileHeader()
stream.stats.binary_file_header.trace_sorting_code = 5

print "Stream object before writing..."
print stream

stream.write("TEST.sgy", format="SEGY", data_encoding=1,
             byteorder=sys.byteorder)
print "Stream object after writing. Will have some segy attributes..."
print stream

print "Reading using obspy.io.segy..."
st1 = _read_segy("TEST.sgy")
print st1

print "Reading using obspy.core..."
st2 = read("TEST.sgy")
print st2

print "Just to show that the values are written..."
print [tr.stats.segy.trace_header.receiver_group_elevation
       for tr in stream]
print [tr.stats.segy.trace_header.receiver_group_elevation for tr in st2]
print stream.stats.binary_file_header.trace_sorting_code
print st1.stats.binary_file_header.trace_sorting_code

Modules

core SEG Y bindings to ObsPy core module.
segy Routines to read and write SEG Y rev 1 encoded seismic data files.
util
header Defines the header structures and some other dictionaries needed for SEG Y read
pack Functions that will all take a file pointer and the sample count and return a
unpack Functions that will all take a file pointer and the sample count and return a