Based on PyTorch
By Jing Wang and Siwei Yu ([email protected])
Center of Geophysics, Harbin Institute of Technology, Harbin, China
If you find this toolbox useful, please cite the following paper (accepted by Geophysics):
Deep learning for denoising (
Note that the results from the examples in this toolbox are not identical to those in the paper: the training set, test set, and programming language are different.
- This code generates sample data from .segy seismic data for deep learning with PyTorch.
- It can be used for denoising or interpolation of seismic data.
- This code is modified from KaiZhang.
- Your own .segy or .sgy seismic data; alternatively, you can download some .segy or .sgy data online with the code we provide.
- The model we provide is trained with the Model94_shots and 7m_shots_0201_0329 datasets (mode: DNCNN).
import numpy as np
import torch
from get_patch import *
from gain import *

# generate patches from the original data (data_dir: folder holding the .segy files)
train_data = datagenerator(data_dir, patch_size=(128, 128), stride=(32, 32),
                           train_data_num=float('inf'), download=False, datasets=[],
                           aug_times=0, scales=[1], verbose=True, jump=1, agc=True)
train_data = train_data.astype(np.float64)
# move the last axis to position 1 before converting to a tensor
xs = torch.from_numpy(train_data.transpose((0, 3, 1, 2)))

# the three datasets below are alternatives; build the one your task needs
# denoising: add noise
DDataset = DenoisingDataset(xs, 25)
# interpolation, random downsampling; rate: the sampling rate
DDataset = DownsamplingDataset(xs, rate=0.7, regular=False)
# interpolation, regular sampling; rate: the sampling interval
DDataset = DownsamplingDataset(xs, rate=2, regular=True)
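To make the three dataset options more concrete, here is a minimal NumPy sketch of the kind of corruption they describe. It is not the toolbox's implementation: `add_gaussian_noise` and `trace_mask` are hypothetical helpers, and the sketch assumes that traces run along the columns of a patch, that the random `rate` is the fraction of traces kept, and that the noise level is a plain standard deviation (how the toolbox scales it against the data amplitude is not stated here).

```python
import numpy as np


def add_gaussian_noise(clean, sigma, rng):
    """Return clean + zero-mean Gaussian noise with standard deviation sigma."""
    noise = rng.normal(0.0, sigma, size=clean.shape)
    return clean + noise


def trace_mask(n_traces, rate, regular, rng):
    """Build a 0/1 mask over traces (assumed to be the columns of a patch).

    regular=False: keep a random fraction `rate` of the traces.
    regular=True:  keep every `rate`-th trace (rate is an interval).
    """
    mask = np.zeros(n_traces, dtype=np.float64)
    if regular:
        mask[::int(rate)] = 1.0
    else:
        n_keep = int(round(rate * n_traces))
        keep = rng.choice(n_traces, size=n_keep, replace=False)
        mask[keep] = 1.0
    return mask


rng = np.random.default_rng(0)
patch = rng.standard_normal((128, 128))  # stand-in for one (time, trace) patch

noisy = add_gaussian_noise(patch, sigma=25, rng=rng)
random_mask = trace_mask(128, rate=0.7, regular=False, rng=rng)
regular_mask = trace_mask(128, rate=2, regular=True, rng=rng)

# Broadcasting the mask over the time axis zeroes whole traces.
decimated = patch * regular_mask[np.newaxis, :]
```

A network for interpolation would then be trained to map `decimated` back to `patch`, and a network for denoising to map `noisy` back to `patch`.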
Parameters in datagenerator:
data_dir : the directory that contains the .segy files, or where you want to download them
patch_size : the size of each patch
stride : the step size used to slide over the data when extracting patches
train_data_num : int or float('inf'), default=float('inf'), meaning all the data will be used to generate patches;
if you need only 3000 patches, set train_data_num=3000
download (bool) : whether to download the datasets from the internet; we provide 7 inline datasets, the order is
datasets (int) : the number of datasets to download from the ones we provide when download=True,
e.g. datasets=2 means that you will download
datasets 1 and 2
aug_times (int) : the number of augmentations to perform, used to increase the diversity of the samples; each time,
one operation is chosen, e.g. flip up and down, or rotate 90 degrees and flip up and down
scales (list) : the ratios by which the data is scaled; default=[1], i.e. no scaling
verbose (bool) : whether to print progress while the patches are generated
jump (int) : default=1, meaning the shots are read one by one; when jump>=2, the shots are read
at that interval instead, e.g. with jump=3 you will use shots 1, 4, 7, ...
agc (bool) : whether to apply AGC (normalize each trace by amplitude) to the data
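As a rough illustration of how patch_size, stride, jump, agc, and train_data_num interact, here is a minimal NumPy sketch. It is not the code of datagenerator: `normalize_traces`, `extract_patches`, and `generate_patches` are hypothetical names, shots are stand-in random arrays rather than data read from .segy files, traces are assumed to be columns, and the AGC step is simplified to dividing each trace by its peak absolute amplitude.

```python
import numpy as np


def normalize_traces(shot, eps=1e-12):
    """Scale each trace (column) by its maximum absolute amplitude."""
    peak = np.max(np.abs(shot), axis=0, keepdims=True)
    return shot / (peak + eps)


def extract_patches(shot, patch_size, stride):
    """Slide a window of patch_size over a 2-D shot with the given stride."""
    ph, pw = patch_size
    sh, sw = stride
    rows, cols = shot.shape
    patches = []
    for top in range(0, rows - ph + 1, sh):
        for left in range(0, cols - pw + 1, sw):
            patches.append(shot[top:top + ph, left:left + pw])
    return patches


def generate_patches(shots, patch_size, stride, jump=1, agc=True,
                     train_data_num=float('inf')):
    """Collect patches from every jump-th shot, stopping at train_data_num."""
    collected = []
    for shot in shots[::jump]:
        if agc:
            shot = normalize_traces(shot)
        for patch in extract_patches(shot, patch_size, stride):
            if len(collected) >= train_data_num:
                # Enough patches: add a trailing channel axis and stop early.
                return np.stack(collected)[..., np.newaxis]
            collected.append(patch)
    return np.stack(collected)[..., np.newaxis]


rng = np.random.default_rng(0)
shots = [rng.standard_normal((256, 192)) for _ in range(7)]  # stand-in shots

# jump=3 uses shots 1, 4 and 7; each 256x192 shot yields 5 x 3 patches.
patches = generate_patches(shots, (128, 128), (32, 32), jump=3)
print(patches.shape)  # (45, 128, 128, 1)
```

The trailing channel axis is there so that the result can be reordered with transpose((0, 3, 1, 2)) as in the usage above.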
- Note: the "jump" parameter is only available when every shot has the same dimensions. We also provide a small .segy data set in 'data/test' to test the "datagenerator" function, or you can just run
to test it and look at visualizations of some of the data sets, like this:
python --data_dir data/train
python --data_dir data/train
(Note: we assume you have put the "segy" files in the "data/train" folder. If not, please use --download True --datasets 2 (2 means you want to use 2 datasets from the default library). Sometimes the network is unstable and the datasets cannot be downloaded, so we provide a baiduyun link for some datasets here, link:
python --data_dir data/test --sigma 50
python --data_dir data/test --rate 2
- For more tasks: salt body classification, wave equation inversion, and tests on field data
- Parallel computing
- Support for MatConvNet