Template matching with a single template¶

Templates are selected from Weiqiang Zhu's PhaseNet catalog.

Data requirements¶

Download the seismic data at: https://doi.org/10.5281/zenodo.15097180

Download the PhaseNet earthquake catalog with the following three commands:

curl -o adloc_events.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_events.csv
curl -o adloc_picks.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_picks.csv
curl -o adloc_stations.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_stations.csv

Installing `fast_matched_filter`¶

This example uses fast_matched_filter, a Python wrapper for C and CUDA-C routines. The C and CUDA-C libraries have to be compiled. Read the instructions at https://ebeauce.github.io/FMF_documentation/introduction.html.

In [3]:

Copied!





import os
import fast_matched_filter as fmf
import glob
import numpy as np
import matplotlib.pyplot as plt
import obspy as obs
import pandas as pd
import os
import fast_matched_filter as fmf
import glob
import numpy as np
import matplotlib.pyplot as plt
import obspy as obs
import pandas as pd

In [4]:

Copied!





# path variables and file names
DIR_WAVEFORMS = "/home/ebeauce/SSA_EQ_DETECTION_WORKSHOP/data" # REPLACE THIS PATH WITH WHEREVER YOU DOWNLOADED THE DATA
DIR_CATALOG = "../picks_phasenet/"

STATION_FILE = "adloc_stations.csv"
EVENT_FILE = "adloc_events.csv"
PICK_FILE = "adloc_picks.csv"
# path variables and file names
DIR_WAVEFORMS = "/home/ebeauce/SSA_EQ_DETECTION_WORKSHOP/data" # REPLACE THIS PATH WITH WHEREVER YOU DOWNLOADED THE DATA
DIR_CATALOG = "../picks_phasenet/"

STATION_FILE = "adloc_stations.csv"
EVENT_FILE = "adloc_events.csv"
PICK_FILE = "adloc_picks.csv"

Load PhaseNet catalog¶

Here, we read the catalog of the 2019 Ridgecrest sequence made with PhaseNet. Information is divided into three files:

a station metadata file,
an event metadata file (the catalog per se),
a pick database, which contains all the P- and S-wave picks found by PhaseNet.

In [5]:

Copied!

station_meta = pd.read_csv(os.path.join(DIR_CATALOG, STATION_FILE))
station_meta
station_meta = pd.read_csv(os.path.join(DIR_CATALOG, STATION_FILE))
station_meta

Out[5]:

	network	station	location	instrument	component	latitude	longitude	elevation_m	depth_km	provider	station_id	station_term_time_p	station_term_time_s	station_term_amplitude
0	CI	CCC	NaN	BH	ENZ	35.524950	-117.364530	670.0	-0.6700	SCEDC	CI.CCC..BH	0.266086	0.500831	0.049399
1	CI	CCC	NaN	HH	ENZ	35.524950	-117.364530	670.0	-0.6700	SCEDC	CI.CCC..HH	0.295428	0.518465	0.191475
2	CI	CCC	NaN	HN	ENZ	35.524950	-117.364530	670.0	-0.6700	SCEDC	CI.CCC..HN	0.296263	0.541148	0.064485
3	CI	CLC	NaN	BH	ENZ	35.815740	-117.597510	775.0	-0.7750	SCEDC	CI.CLC..BH	-0.231963	-0.415271	-0.331371
4	CI	CLC	NaN	HH	ENZ	35.815740	-117.597510	775.0	-0.7750	SCEDC	CI.CLC..HH	-0.168743	-0.390045	-0.140313
5	CI	CLC	NaN	HN	ENZ	35.815740	-117.597510	775.0	-0.7750	SCEDC	CI.CLC..HN	-0.175671	-0.388116	-0.249066
6	CI	DTP	NaN	BH	ENZ	35.267420	-117.845810	908.0	-0.9080	SCEDC	CI.DTP..BH	-0.305881	-0.602459	-0.503411
7	CI	DTP	NaN	HH	ENZ	35.267420	-117.845810	908.0	-0.9080	SCEDC	CI.DTP..HH	-0.263705	-0.564867	-0.437951
8	CI	DTP	NaN	HN	ENZ	35.267420	-117.845810	908.0	-0.9080	SCEDC	CI.DTP..HN	-0.244383	-0.538990	-0.500516
9	CI	JRC2	NaN	BH	ENZ	35.982490	-117.808850	1469.0	-1.4690	SCEDC	CI.JRC2..BH	0.011361	-0.080285	-0.039941
10	CI	JRC2	NaN	HH	ENZ	35.982490	-117.808850	1469.0	-1.4690	SCEDC	CI.JRC2..HH	0.053539	-0.052748	0.068213
11	CI	JRC2	NaN	HN	ENZ	35.982490	-117.808850	1469.0	-1.4690	SCEDC	CI.JRC2..HN	0.059764	-0.045991	-0.007637
12	CI	LRL	NaN	BH	ENZ	35.479540	-117.682120	1340.0	-1.3400	SCEDC	CI.LRL..BH	-0.295604	-0.540857	0.033788
13	CI	LRL	NaN	HH	ENZ	35.479540	-117.682120	1340.0	-1.3400	SCEDC	CI.LRL..HH	-0.268381	-0.513955	0.146876
14	CI	LRL	NaN	HN	ENZ	35.479540	-117.682120	1340.0	-1.3400	SCEDC	CI.LRL..HN	-0.266329	-0.503088	0.045499
15	CI	LRL	2C	HN	ENZ	35.479540	-117.682120	1340.0	-1.3400	SCEDC	CI.LRL.2C.HN	0.000000	0.000000	0.000000
16	CI	MPM	NaN	BH	ENZ	36.057990	-117.489010	1839.0	-1.8390	SCEDC	CI.MPM..BH	-0.011825	-0.098089	-0.518793
17	CI	MPM	NaN	HH	ENZ	36.057990	-117.489010	1839.0	-1.8390	SCEDC	CI.MPM..HH	0.009095	-0.081896	-0.459349
18	CI	MPM	NaN	HN	ENZ	36.057990	-117.489010	1839.0	-1.8390	SCEDC	CI.MPM..HN	0.011870	-0.052277	-0.532764
19	CI	Q0072	01	HN	ENZ	35.609617	-117.666721	695.0	-0.6950	SCEDC	CI.Q0072.01.HN	0.000000	0.000000	0.000000
20	CI	SLA	NaN	BH	ENZ	35.890950	-117.283320	1174.0	-1.1740	SCEDC	CI.SLA..BH	0.066893	0.118500	-0.081634
21	CI	SLA	NaN	HH	ENZ	35.890950	-117.283320	1174.0	-1.1740	SCEDC	CI.SLA..HH	0.089833	0.128589	-0.042928
22	CI	SLA	NaN	HN	ENZ	35.890950	-117.283320	1174.0	-1.1740	SCEDC	CI.SLA..HN	0.093526	0.168238	-0.150381
23	CI	SRT	NaN	BH	ENZ	35.692350	-117.750510	667.0	-0.6670	SCEDC	CI.SRT..BH	0.148171	0.653411	-0.253527
24	CI	SRT	NaN	HH	ENZ	35.692350	-117.750510	667.0	-0.6670	SCEDC	CI.SRT..HH	0.183868	0.641707	-0.208859
25	CI	SRT	NaN	HN	ENZ	35.692350	-117.750510	667.0	-0.6670	SCEDC	CI.SRT..HN	0.175475	0.674587	-0.295869
26	CI	TOW2	NaN	BH	ENZ	35.808560	-117.764880	685.0	-0.6850	SCEDC	CI.TOW2..BH	0.268083	0.816279	0.028038
27	CI	TOW2	NaN	HH	ENZ	35.808560	-117.764880	685.0	-0.6850	SCEDC	CI.TOW2..HH	0.285970	0.831533	0.102681
28	CI	TOW2	NaN	HN	ENZ	35.808560	-117.764880	685.0	-0.6850	SCEDC	CI.TOW2..HN	0.285113	0.843886	-0.006698
29	CI	WBM	NaN	BH	ENZ	35.608390	-117.890490	892.0	-0.8920	SCEDC	CI.WBM..BH	0.136927	0.128744	-0.166279
30	CI	WBM	NaN	HH	ENZ	35.608390	-117.890490	892.0	-0.8920	SCEDC	CI.WBM..HH	0.152299	0.124250	-0.138614
31	CI	WBM	NaN	HN	ENZ	35.608390	-117.890490	892.0	-0.8920	SCEDC	CI.WBM..HN	0.170719	0.179461	-0.101436
32	CI	WBM	2C	HN	ENZ	35.608390	-117.890490	892.0	-0.8920	SCEDC	CI.WBM.2C.HN	0.000000	0.000000	0.000000
33	CI	WCS2	NaN	BH	ENZ	36.025210	-117.765260	1143.0	-1.1430	SCEDC	CI.WCS2..BH	0.030349	-0.126935	0.023642
34	CI	WCS2	NaN	HH	ENZ	36.025210	-117.765260	1143.0	-1.1430	SCEDC	CI.WCS2..HH	0.065321	-0.099447	0.146628
35	CI	WCS2	NaN	HN	ENZ	36.025210	-117.765260	1143.0	-1.1430	SCEDC	CI.WCS2..HN	0.061651	-0.096450	0.037563
36	CI	WMF	NaN	BH	ENZ	36.117580	-117.854860	1537.4	-1.5374	SCEDC	CI.WMF..BH	0.039416	-0.005761	-0.165183
37	CI	WMF	NaN	HH	ENZ	36.117580	-117.854860	1537.4	-1.5374	SCEDC	CI.WMF..HH	0.075658	0.005327	-0.079900
38	CI	WMF	NaN	HN	ENZ	36.117580	-117.854860	1537.4	-1.5374	SCEDC	CI.WMF..HN	0.085427	0.025273	-0.240971
39	CI	WMF	2C	HN	ENZ	36.117580	-117.854860	1537.4	-1.5374	SCEDC	CI.WMF.2C.HN	0.000000	0.000000	0.000000
40	CI	WNM	NaN	EH	Z	35.842200	-117.906160	974.3	-0.9743	SCEDC	CI.WNM..EH	-0.026889	-0.070269	-0.332510
41	CI	WNM	NaN	HN	ENZ	35.842200	-117.906160	974.3	-0.9743	SCEDC	CI.WNM..HN	-0.011659	-0.118223	-0.038719
42	CI	WNM	2C	HN	ENZ	35.842200	-117.906160	974.3	-0.9743	SCEDC	CI.WNM.2C.HN	0.000000	0.000000	0.000000
43	CI	WRC2	NaN	BH	ENZ	35.947900	-117.650380	943.0	-0.9430	SCEDC	CI.WRC2..BH	-0.023860	-0.043523	0.117833
44	CI	WRC2	NaN	HH	ENZ	35.947900	-117.650380	943.0	-0.9430	SCEDC	CI.WRC2..HH	0.010633	-0.029488	0.165234
45	CI	WRC2	NaN	HN	ENZ	35.947900	-117.650380	943.0	-0.9430	SCEDC	CI.WRC2..HN	0.014658	-0.021367	0.103905
46	CI	WRV2	NaN	EH	Z	36.007740	-117.890400	1070.0	-1.0700	SCEDC	CI.WRV2..EH	-0.003461	-0.154572	-0.355199
47	CI	WRV2	NaN	HN	ENZ	36.007740	-117.890400	1070.0	-1.0700	SCEDC	CI.WRV2..HN	0.017317	-0.137916	-0.273270
48	CI	WRV2	2C	HN	ENZ	36.007740	-117.890400	1070.0	-1.0700	SCEDC	CI.WRV2.2C.HN	0.000000	0.000000	0.000000
49	CI	WVP2	NaN	EH	Z	35.949390	-117.817690	1465.0	-1.4650	SCEDC	CI.WVP2..EH	0.020325	0.008989	-0.341606
50	CI	WVP2	NaN	HN	ENZ	35.949390	-117.817690	1465.0	-1.4650	SCEDC	CI.WVP2..HN	0.024742	-0.103024	-0.089014
51	CI	WVP2	2C	HN	ENZ	35.949390	-117.817690	1465.0	-1.4650	SCEDC	CI.WVP2.2C.HN	0.000000	0.000000	0.000000

The following shows a very rudimentary map of the station network. Look into the cartopy package for more sophisticated maps.

In [6]:

Copied!





_station_meta = station_meta.drop_duplicates("station")

fig, ax = plt.subplots(num="station_network", figsize=(10, 10))
ax.scatter(_station_meta["longitude"], _station_meta["latitude"], marker="v", color="k")
for idx, row in _station_meta.iterrows():
    ax.text(row.longitude + 0.01, row.latitude + 0.01, row.station, va="bottom", ha="left")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.grid()
ax.set_title("Stations used to build the PhaseNet catalog")
_station_meta = station_meta.drop_duplicates("station")

fig, ax = plt.subplots(num="station_network", figsize=(10, 10))
ax.scatter(_station_meta["longitude"], _station_meta["latitude"], marker="v", color="k")
for idx, row in _station_meta.iterrows():
    ax.text(row.longitude + 0.01, row.latitude + 0.01, row.station, va="bottom", ha="left")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.grid()
ax.set_title("Stations used to build the PhaseNet catalog")

Out[6]:

Text(0.5, 1.0, 'Stations used to build the PhaseNet catalog')

No description has been provided for this image

In [7]:

Copied!

event_meta = pd.read_csv(os.path.join(DIR_CATALOG, EVENT_FILE))
event_meta
event_meta = pd.read_csv(os.path.join(DIR_CATALOG, EVENT_FILE))
event_meta

Out[7]:

	time	adloc_score	adloc_residual_time	num_picks	magnitude	adloc_residual_amplitude	event_index	longitude	latitude	depth_km
0	2019-07-04 00:46:47.342963596	0.866596	0.057355	35	0.595368	0.165480	6720	-117.882570	36.091088	4.643969
1	2019-07-04 00:55:32.648412579	0.781116	0.203567	16	1.142510	0.170509	17122	-117.799226	35.378160	11.078458
2	2019-07-04 00:56:37.232733104	0.908073	0.086183	42	0.912494	0.166681	5411	-117.880902	36.091986	4.854336
3	2019-07-04 02:00:39.149363202	0.814322	0.036164	15	0.209530	0.092534	17868	-117.866468	36.093520	4.981447
4	2019-07-04 03:05:31.018885833	0.799281	0.080708	11	0.104050	0.156833	25037	-117.846320	36.100386	5.943363
...	...	...	...	...	...	...	...	...	...	...
20898	2019-07-09 23:58:14.298499048	0.933013	0.097816	67	1.543257	0.131523	1907	-117.711811	35.926468	6.652020
20899	2019-07-09 23:58:47.701746285	0.882434	0.090185	73	1.087089	0.140159	2051	-117.604263	35.797450	6.617413
20900	2019-07-09 23:59:05.102247662	0.798047	0.435077	26	1.147040	0.170994	12567	-117.509729	35.692722	12.815041
20901	2019-07-09 23:59:40.257837813	0.971081	0.065523	35	1.161323	0.068586	7726	-117.846289	36.061435	5.224666
20902	2019-07-09 23:59:49.650466544	0.800524	0.023674	15	0.936622	0.095959	18566	-117.896100	36.095886	6.761860

20903 rows × 10 columns

In [8]:

Copied!

picks = pd.read_csv(os.path.join(DIR_CATALOG, PICK_FILE))
picks
picks = pd.read_csv(os.path.join(DIR_CATALOG, PICK_FILE))
picks

Out[8]:

	station_id	phase_index	phase_time	phase_score	phase_type	dt_s	phase_polarity	phase_amplitude	event_index	sp_ratio	event_amplitude	adloc_mask	adloc_residual_time	adloc_residual_amplitude
0	CI.WMF..BH	280874	2019-07-04 00:46:48.759	0.594	P	0.01	0.110	-3.213107	6720	0.955091	3.124152e-07	1	0.003919	0.012320
1	CI.WMF..HH	280881	2019-07-04 00:46:48.818	0.973	P	0.01	0.938	-3.077638	6720	0.948896	3.368390e-07	1	0.026491	0.063669
2	CI.WMF..HN	280881	2019-07-04 00:46:48.818	0.973	P	0.01	0.898	-3.128019	6720	1.598008	2.146277e-07	1	0.017580	0.173702
3	CI.WRV2..EH	280945	2019-07-04 00:46:49.450	0.977	P	0.01	-0.855	-3.464453	6720	1.000000	3.298200e-07	1	0.077194	0.244381
4	CI.WRV2..HN	280945	2019-07-04 00:46:49.450	0.941	P	0.01	-0.906	-3.585863	6720	1.147235	3.908305e-07	1	0.056694	0.041691
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
955700	CI.WRV2..HN	8639355	2019-07-09 23:59:53.550	0.688	S	0.01	-0.022	-3.384471	18566	0.644492	1.973834e-06	1	0.019093	0.028409
955701	CI.JRC2..HH	8639499	2019-07-09 23:59:54.998	0.402	S	0.01	-0.023	-3.446238	18566	0.647843	6.197256e-06	1	0.020364	-0.154404
955702	CI.JRC2..HN	8639499	2019-07-09 23:59:54.998	0.344	S	0.01	-0.016	-3.531800	18566	0.731796	4.075353e-06	1	0.012773	-0.164659
955703	CI.WVP2..EH	8639574	2019-07-09 23:59:55.740	0.314	S	0.01	-0.103	-3.488117	18566	0.636005	2.815148e-06	1	-0.092233	0.316712
955704	CI.WVP2..HN	8639576	2019-07-09 23:59:55.760	0.695	S	0.01	0.076	-3.218676	18566	0.825223	4.657352e-06	1	0.039319	0.332722

955705 rows × 14 columns

Pick one event¶

Let's use the PhaseNet catalog to read the waveforms of an event.

In [9]:

Copied!





def fetch_event_waveforms(
    event_picks,
    dir_waveforms=DIR_WAVEFORMS,
    time_before_phase_onset_sec=2.0,
    duration_sec=10.0
    ):
    """
    Fetches the waveforms for a given event based on the picks.

    Parameters
    ----------
    event_picks : pandas.DataFrame
        DataFrame containing the picks for the event.
    dir_waveforms : str, optional
        Directory where the waveform data is stored, by default DIR_WAVEFORMS.
    time_before_phase_onset_sec : float, optional
        Time in seconds to start the waveform before the phase onset, by default 2.0.
    duration_sec : float, optional
        Duration in seconds of the waveform to fetch, by default 10.0.

    Returns
    -------
    obspy.Stream
        Stream object containing the fetched waveforms.
    """
    stream = obs.Stream()
    for _, pick in event_picks.iterrows():
        # check whether we have a miniseed file for this waveform
        if pick.phase_type == "P":
            files = glob.glob(os.path.join(dir_waveforms, pick.station_id + "Z*mseed"))
        elif pick.phase_type == "S":
            files = glob.glob(os.path.join(dir_waveforms, pick.station_id + "[N,E]*mseed"))
        starttime = obs.UTCDateTime(pick.phase_time) - time_before_phase_onset_sec
        endtime = starttime + duration_sec
        for _file in files:
            stream += obs.read(
                _file,
                starttime=starttime,
                endtime=endtime
            )
    return stream
    
def fetch_event_waveforms(
    event_picks,
    dir_waveforms=DIR_WAVEFORMS,
    time_before_phase_onset_sec=2.0,
    duration_sec=10.0
    ):
    """
    Fetches the waveforms for a given event based on the picks.

    Parameters
    ----------
    event_picks : pandas.DataFrame
        DataFrame containing the picks for the event.
    dir_waveforms : str, optional
        Directory where the waveform data is stored, by default DIR_WAVEFORMS.
    time_before_phase_onset_sec : float, optional
        Time in seconds to start the waveform before the phase onset, by default 2.0.
    duration_sec : float, optional
        Duration in seconds of the waveform to fetch, by default 10.0.

    Returns
    -------
    obspy.Stream
        Stream object containing the fetched waveforms.
    """
    stream = obs.Stream()
    for _, pick in event_picks.iterrows():
        # check whether we have a miniseed file for this waveform
        if pick.phase_type == "P":
            files = glob.glob(os.path.join(dir_waveforms, pick.station_id + "Z*mseed"))
        elif pick.phase_type == "S":
            files = glob.glob(os.path.join(dir_waveforms, pick.station_id + "[N,E]*mseed"))
        starttime = obs.UTCDateTime(pick.phase_time) - time_before_phase_onset_sec
        endtime = starttime + duration_sec
        for _file in files:
            stream += obs.read(
                _file,
                starttime=starttime,
                endtime=endtime
            )
    return stream
    

In [10]:

Copied!

# explore event_meta to find a nice intermediate-size earthquake we could plot
event_meta.head(20)
# explore event_meta to find a nice intermediate-size earthquake we could plot
event_meta.head(20)

Out[10]:

	time	adloc_score	adloc_residual_time	num_picks	magnitude	adloc_residual_amplitude	event_index	longitude	latitude	depth_km
0	2019-07-04 00:46:47.342963596	0.866596	0.057355	35	0.595368	0.165480	6720	-117.882570	36.091088	4.643969
1	2019-07-04 00:55:32.648412579	0.781116	0.203567	16	1.142510	0.170509	17122	-117.799226	35.378160	11.078458
2	2019-07-04 00:56:37.232733104	0.908073	0.086183	42	0.912494	0.166681	5411	-117.880902	36.091986	4.854336
3	2019-07-04 02:00:39.149363202	0.814322	0.036164	15	0.209530	0.092534	17868	-117.866468	36.093520	4.981447
4	2019-07-04 03:05:31.018885833	0.799281	0.080708	11	0.104050	0.156833	25037	-117.846320	36.100386	5.943363
5	2019-07-04 03:20:28.674438914	0.755451	0.050436	44	0.869927	0.111885	6929	-117.673684	36.114308	5.894466
6	2019-07-04 04:03:01.619369274	0.897311	0.143631	32	0.386310	0.214381	7607	-117.805077	36.016063	0.396071
7	2019-07-04 05:16:47.223353119	0.835094	0.053522	22	0.575433	0.163675	13098	-117.879256	36.090409	4.268292
8	2019-07-04 06:57:32.758991812	0.922386	0.065846	23	0.256281	0.065212	15698	-117.867587	36.081665	5.939931
9	2019-07-04 11:51:07.805591259	0.692042	0.075023	48	0.818316	0.102087	4380	-117.671909	36.118573	6.085804
10	2019-07-04 15:36:04.228420696	0.652888	0.012137	9	-0.633904	0.167058	27370	-117.785299	36.005578	3.234981
11	2019-07-04 15:42:47.932558745	0.597856	0.072740	28	0.933088	0.151623	9740	-117.501116	35.707382	14.496446
12	2019-07-04 16:07:20.003321194	0.729246	0.093399	33	0.835854	0.133106	9365	-117.491503	35.711241	13.846898
13	2019-07-04 16:11:46.920083440	0.711579	0.516367	18	1.734816	0.117952	7098	-117.877675	35.196445	31.000000
14	2019-07-04 16:13:43.094792540	0.849673	0.070048	83	1.653859	0.111108	2305	-117.493716	35.710090	13.375377
15	2019-07-04 16:16:07.085486307	0.814365	0.040161	10	0.587583	0.099619	26520	-117.540198	35.689022	14.537368
16	2019-07-04 17:02:55.057058245	0.770696	0.079675	90	4.476724	0.143489	1101	-117.495094	35.711607	13.586411
17	2019-07-04 17:04:02.231614981	0.664676	0.064038	41	2.062226	0.186214	5379	-117.488446	35.711139	14.001705
18	2019-07-04 17:05:05.071421677	0.509088	0.089882	29	1.471434	0.157048	12115	-117.491307	35.710881	13.223374
19	2019-07-04 17:08:51.664841725	0.666790	0.046444	15	0.657653	0.175270	23044	-117.504506	35.706123	14.570921

In [11]:

Copied!

# feel free to play with the event index to plot different events
EVENT_IDX = 1101

event_meta.set_index("event_index").loc[EVENT_IDX]
# feel free to play with the event index to plot different events
EVENT_IDX = 1101

event_meta.set_index("event_index").loc[EVENT_IDX]

Out[11]:

time                        2019-07-04 17:02:55.057058245
adloc_score                                      0.770696
adloc_residual_time                              0.079675
num_picks                                              90
magnitude                                        4.476724
adloc_residual_amplitude                         0.143489
longitude                                     -117.495094
latitude                                        35.711607
depth_km                                        13.586411
Name: 1101, dtype: object

In [12]:

Copied!

# fetch the corresponding picks for this event
event_picks = picks[picks["event_index"] == EVENT_IDX]
event_picks
# fetch the corresponding picks for this event
event_picks = picks[picks["event_index"] == EVENT_IDX]
event_picks

Out[12]:

	station_id	phase_index	phase_time	phase_score	phase_type	dt_s	phase_polarity	phase_amplitude	event_index	sp_ratio	event_amplitude	adloc_mask	adloc_residual_time	adloc_residual_amplitude
565	CI.CLC..HN	6137848	2019-07-04 17:02:58.488	0.969	P	0.01	-0.879	-0.511873	1101	1.017569	2.474507e-07	1	-0.010048	-0.054426
566	CI.CLC..BH	6137847	2019-07-04 17:02:58.489	0.633	P	0.01	-0.324	-0.666956	1101	1.014822	2.022085e-07	1	0.048153	-0.127168
567	CI.CLC..HH	6137849	2019-07-04 17:02:58.498	0.965	P	0.01	-0.883	-0.586030	1101	1.086886	2.141392e-07	1	-0.007135	-0.236796
568	CI.SRT..HN	6137990	2019-07-04 17:02:59.908	0.879	P	0.01	0.699	-0.643401	1101	1.377537	4.387864e-07	1	-0.069463	0.065039
569	CI.SRT..HH	6137990	2019-07-04 17:02:59.908	0.891	P	0.01	0.734	-0.601019	1101	0.729745	5.910421e-07	1	-0.078144	0.020251
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
650	CI.WMF..HH	6139185	2019-07-04 17:03:11.858	0.672	S	0.01	0.021	-1.368759	1101	1.096820	3.073536e-07	1	0.140179	-0.329117
651	CI.WMF..HN	6139189	2019-07-04 17:03:11.898	0.633	S	0.01	0.026	-1.371407	1101	0.989939	2.891483e-07	1	0.159858	-0.171350
652	CI.DTP..BH	6139200	2019-07-04 17:03:12.019	0.734	S	0.01	0.035	-1.322484	1101	1.385395	1.624267e-07	1	0.068940	0.175198
653	CI.DTP..HH	6139203	2019-07-04 17:03:12.038	0.707	S	0.01	-0.015	-1.306977	1101	1.122809	3.205840e-07	1	0.052824	0.125587
654	CI.DTP..HN	6139211	2019-07-04 17:03:12.118	0.625	S	0.01	0.043	-1.282829	1101	0.863380	2.702479e-07	1	0.105026	0.211253

90 rows × 14 columns

In [13]:

Copied!

# fetch the waveforms
event_waveforms = fetch_event_waveforms(event_picks, time_before_phase_onset_sec=10., duration_sec=30.)
print(event_waveforms.__str__(extended=True))
# fetch the waveforms
event_waveforms = fetch_event_waveforms(event_picks, time_before_phase_onset_sec=10., duration_sec=30.)
print(event_waveforms.__str__(extended=True))

42 Trace(s) in Stream:
CI.CLC..HHZ  | 2019-07-04T17:02:48.478300Z - 2019-07-04T17:03:18.478300Z | 25.0 Hz, 751 samples
CI.SRT..HHZ  | 2019-07-04T17:02:49.918300Z - 2019-07-04T17:03:19.918300Z | 25.0 Hz, 751 samples
CI.CCC..HHZ  | 2019-07-04T17:02:50.118300Z - 2019-07-04T17:03:20.118300Z | 25.0 Hz, 751 samples
CI.SLA..HHZ  | 2019-07-04T17:02:50.518300Z - 2019-07-04T17:03:20.518300Z | 25.0 Hz, 751 samples
CI.TOW2..HHZ | 2019-07-04T17:02:50.518300Z - 2019-07-04T17:03:20.518300Z | 25.0 Hz, 751 samples
CI.LRL..HHZ  | 2019-07-04T17:02:50.638300Z - 2019-07-04T17:03:20.638300Z | 25.0 Hz, 751 samples
CI.WRC2..HHZ | 2019-07-04T17:02:50.718300Z - 2019-07-04T17:03:20.718300Z | 25.0 Hz, 751 samples
CI.CLC..HHN  | 2019-07-04T17:02:50.958300Z - 2019-07-04T17:03:20.958300Z | 25.0 Hz, 751 samples
CI.CLC..HHE  | 2019-07-04T17:02:50.958300Z - 2019-07-04T17:03:20.958300Z | 25.0 Hz, 751 samples
CI.MPM..HHZ  | 2019-07-04T17:02:52.038300Z - 2019-07-04T17:03:22.038300Z | 25.0 Hz, 751 samples
CI.WBM..HHZ  | 2019-07-04T17:02:52.123100Z - 2019-07-04T17:03:22.123100Z | 25.0 Hz, 751 samples
CI.WVP2..EHZ | 2019-07-04T17:02:52.160000Z - 2019-07-04T17:03:22.160000Z | 25.0 Hz, 751 samples
CI.WNM..EHZ  | 2019-07-04T17:02:52.160000Z - 2019-07-04T17:03:22.160000Z | 25.0 Hz, 751 samples
CI.JRC2..HHZ | 2019-07-04T17:02:52.558300Z - 2019-07-04T17:03:22.558300Z | 25.0 Hz, 751 samples
CI.WCS2..HHZ | 2019-07-04T17:02:52.598300Z - 2019-07-04T17:03:22.598300Z | 25.0 Hz, 751 samples
CI.WRV2..EHZ | 2019-07-04T17:02:53.600000Z - 2019-07-04T17:03:23.600000Z | 25.0 Hz, 751 samples
CI.SRT..HHN  | 2019-07-04T17:02:53.838300Z - 2019-07-04T17:03:23.838300Z | 25.0 Hz, 751 samples
CI.SRT..HHE  | 2019-07-04T17:02:53.838300Z - 2019-07-04T17:03:23.838300Z | 25.0 Hz, 751 samples
CI.CCC..HHN  | 2019-07-04T17:02:54.078300Z - 2019-07-04T17:03:24.078300Z | 25.0 Hz, 751 samples
CI.CCC..HHE  | 2019-07-04T17:02:54.078300Z - 2019-07-04T17:03:24.078300Z | 25.0 Hz, 751 samples
CI.SLA..HHE  | 2019-07-04T17:02:54.638300Z - 2019-07-04T17:03:24.638300Z | 25.0 Hz, 751 samples
CI.SLA..HHN  | 2019-07-04T17:02:54.638300Z - 2019-07-04T17:03:24.638300Z | 25.0 Hz, 751 samples
CI.LRL..HHN  | 2019-07-04T17:02:54.718300Z - 2019-07-04T17:03:24.718300Z | 25.0 Hz, 751 samples
CI.LRL..HHE  | 2019-07-04T17:02:54.718300Z - 2019-07-04T17:03:24.718300Z | 25.0 Hz, 751 samples
CI.WMF..HHZ  | 2019-07-04T17:02:54.798300Z - 2019-07-04T17:03:24.798300Z | 25.0 Hz, 751 samples
CI.DTP..HHZ  | 2019-07-04T17:02:54.958300Z - 2019-07-04T17:03:24.958300Z | 25.0 Hz, 751 samples
CI.TOW2..HHN | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples
CI.TOW2..HHE | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples
CI.WRC2..HHN | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples
CI.WRC2..HHE | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples
CI.WBM..HHE  | 2019-07-04T17:02:57.243100Z - 2019-07-04T17:03:27.243100Z | 25.0 Hz, 751 samples
CI.WBM..HHN  | 2019-07-04T17:02:57.243100Z - 2019-07-04T17:03:27.243100Z | 25.0 Hz, 751 samples
CI.MPM..HHE  | 2019-07-04T17:02:57.318300Z - 2019-07-04T17:03:27.318300Z | 25.0 Hz, 751 samples
CI.MPM..HHN  | 2019-07-04T17:02:57.318300Z - 2019-07-04T17:03:27.318300Z | 25.0 Hz, 751 samples
CI.JRC2..HHN | 2019-07-04T17:02:57.958300Z - 2019-07-04T17:03:27.958300Z | 25.0 Hz, 751 samples
CI.JRC2..HHE | 2019-07-04T17:02:57.958300Z - 2019-07-04T17:03:27.958300Z | 25.0 Hz, 751 samples
CI.WCS2..HHE | 2019-07-04T17:02:58.118300Z - 2019-07-04T17:03:28.118300Z | 25.0 Hz, 751 samples
CI.WCS2..HHN | 2019-07-04T17:02:58.118300Z - 2019-07-04T17:03:28.118300Z | 25.0 Hz, 751 samples
CI.WMF..HHN  | 2019-07-04T17:03:01.838300Z - 2019-07-04T17:03:31.838300Z | 25.0 Hz, 751 samples
CI.WMF..HHE  | 2019-07-04T17:03:01.838300Z - 2019-07-04T17:03:31.838300Z | 25.0 Hz, 751 samples
CI.DTP..HHN  | 2019-07-04T17:03:02.038300Z - 2019-07-04T17:03:32.038300Z | 25.0 Hz, 751 samples
CI.DTP..HHE  | 2019-07-04T17:03:02.038300Z - 2019-07-04T17:03:32.038300Z | 25.0 Hz, 751 samples

In [14]:

Copied!

# plot them!
fig = event_waveforms.select(component="Z").plot(equal_scale=False)
# plot them!
fig = event_waveforms.select(component="Z").plot(equal_scale=False)

Run template matching¶

We will now use one of the events from the PhaseNet catalog as a template event to detect events with template matching.

In [16]:

Copied!

selected_event_meta = event_meta.set_index("event_index").loc[EVENT_IDX]
selected_event_meta
selected_event_meta = event_meta.set_index("event_index").loc[EVENT_IDX]
selected_event_meta

Out[16]:

time                        2019-07-04 17:02:55.057058245
adloc_score                                      0.770696
adloc_residual_time                              0.079675
num_picks                                              90
magnitude                                        4.476724
adloc_residual_amplitude                         0.143489
longitude                                     -117.495094
latitude                                        35.711607
depth_km                                        13.586411
Name: 1101, dtype: object

In [17]:

Copied!

selected_event_picks = picks[picks["event_index"] == EVENT_IDX]
selected_event_picks
selected_event_picks = picks[picks["event_index"] == EVENT_IDX]
selected_event_picks

Out[17]:

	station_id	phase_index	phase_time	phase_score	phase_type	dt_s	phase_polarity	phase_amplitude	event_index	sp_ratio	event_amplitude	adloc_mask	adloc_residual_time	adloc_residual_amplitude
565	CI.CLC..HN	6137848	2019-07-04 17:02:58.488	0.969	P	0.01	-0.879	-0.511873	1101	1.017569	2.474507e-07	1	-0.010048	-0.054426
566	CI.CLC..BH	6137847	2019-07-04 17:02:58.489	0.633	P	0.01	-0.324	-0.666956	1101	1.014822	2.022085e-07	1	0.048153	-0.127168
567	CI.CLC..HH	6137849	2019-07-04 17:02:58.498	0.965	P	0.01	-0.883	-0.586030	1101	1.086886	2.141392e-07	1	-0.007135	-0.236796
568	CI.SRT..HN	6137990	2019-07-04 17:02:59.908	0.879	P	0.01	0.699	-0.643401	1101	1.377537	4.387864e-07	1	-0.069463	0.065039
569	CI.SRT..HH	6137990	2019-07-04 17:02:59.908	0.891	P	0.01	0.734	-0.601019	1101	0.729745	5.910421e-07	1	-0.078144	0.020251
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
650	CI.WMF..HH	6139185	2019-07-04 17:03:11.858	0.672	S	0.01	0.021	-1.368759	1101	1.096820	3.073536e-07	1	0.140179	-0.329117
651	CI.WMF..HN	6139189	2019-07-04 17:03:11.898	0.633	S	0.01	0.026	-1.371407	1101	0.989939	2.891483e-07	1	0.159858	-0.171350
652	CI.DTP..BH	6139200	2019-07-04 17:03:12.019	0.734	S	0.01	0.035	-1.322484	1101	1.385395	1.624267e-07	1	0.068940	0.175198
653	CI.DTP..HH	6139203	2019-07-04 17:03:12.038	0.707	S	0.01	-0.015	-1.306977	1101	1.122809	3.205840e-07	1	0.052824	0.125587
654	CI.DTP..HN	6139211	2019-07-04 17:03:12.118	0.625	S	0.01	0.043	-1.282829	1101	0.863380	2.702479e-07	1	0.105026	0.211253

90 rows × 14 columns

Read data from same day¶

In [18]:

Copied!





def fetch_day_waveforms(dir_waveforms):
    """
    Fetches the continuous seismograms for a given day.

    Parameters
    ----------
    dir_waveforms : str
        Directory where the waveform data is stored, by default DIR_WAVEFORMS.

    Returns
    -------
    obspy.Stream
        Stream object containing the fetched continuous seismograms.
    """
    stream = obs.Stream()
    files = glob.glob(os.path.join(dir_waveforms, "*mseed"))
    for _file in files:
        stream += obs.read(_file)
    return stream
def fetch_day_waveforms(dir_waveforms):
    """
    Fetches the continuous seismograms for a given day.

    Parameters
    ----------
    dir_waveforms : str
        Directory where the waveform data is stored, by default DIR_WAVEFORMS.

    Returns
    -------
    obspy.Stream
        Stream object containing the fetched continuous seismograms.
    """
    stream = obs.Stream()
    files = glob.glob(os.path.join(dir_waveforms, "*mseed"))
    for _file in files:
        stream += obs.read(_file)
    return stream

In [19]:

Copied!

# first, read the continuous seismograms into an `obspy.Stream`
continuous_seismograms = fetch_day_waveforms(DIR_WAVEFORMS)
print(continuous_seismograms.__str__(extended=True))
# first, read the continuous seismograms into an `obspy.Stream`
continuous_seismograms = fetch_day_waveforms(DIR_WAVEFORMS)
print(continuous_seismograms.__str__(extended=True))

57 Trace(s) in Stream:
CI.WCS2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
PB.B916..EHZ | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples
CI.CLC..HHN  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WRV2..EHZ | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples
PB.B916..EH2 | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples
PB.B917..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WMF..HHZ  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
PB.B918..EHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
PB.B918..EH2 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.TOW2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.SLA..HHE  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WRC2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.JRC2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.CCC..HHZ  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.DTP..HHN  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.SRT..HHZ  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.LRL..HHZ  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.SLA..HHN  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.MPM..HHZ  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.DTP..HHE  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WBM..HHZ  | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples
CI.WCS2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.CLC..HHE  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
PB.B921..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.DAW..HHZ  | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples
CI.CLC..HHZ  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WMF..HHN  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.TOW2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.DAW..HHE  | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples
PB.B918..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WRC2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.JRC2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.CCC..HHN  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
PB.B916..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.MPM..HHE  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.DTP..HHZ  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WBM..HHE  | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples
CI.SRT..HHN  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
PB.B917..EHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WVP2..EHZ | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples
PB.B917..EH2 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.LRL..HHN  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.CCC..HHE  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.SLA..HHZ  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WNM..EHZ  | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples
CI.WRC2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.JRC2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.SRT..HHE  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.LRL..HHE  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.MPM..HHN  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
PB.B921..EHZ | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples
CI.WBM..HHN  | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples
PB.B921..EH2 | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples
CI.WMF..HHE  | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.WCS2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.TOW2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples
CI.DAW..HHN  | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples

In [20]:

Copied!

# plot the continuous seismograms from a single station
fig = continuous_seismograms.select(station="CLC").plot()
# plot the continuous seismograms from a single station
fig = continuous_seismograms.select(station="CLC").plot()

In [21]:

Copied!





# then, cast data into `numpy.ndarray`
station_codes = list(set([st.stats.station for st in continuous_seismograms]))
component_codes = ["N", "E", "Z"]
component_aliases={"E": ["E", "2"], "N": ["N", "1"], "Z": ["Z"]}

num_stations = len(station_codes)
num_channels = len(component_codes)
num_samples = len(continuous_seismograms[0].data)

continuous_seismograms_arr = np.zeros((num_stations, num_channels, num_samples), dtype=np.float32)
for s, sta in enumerate(station_codes):
    for c, cp in enumerate(component_codes):
        for cp_alias in component_aliases[cp]:
            sel_seismogram = continuous_seismograms.select(station=sta, component=cp_alias)
            if len(sel_seismogram) > 0:
                continuous_seismograms_arr[s, c, :] = sel_seismogram[0].data
                break
            
continuous_seismograms_arr
# then, cast data into `numpy.ndarray`
station_codes = list(set([st.stats.station for st in continuous_seismograms]))
component_codes = ["N", "E", "Z"]
component_aliases={"E": ["E", "2"], "N": ["N", "1"], "Z": ["Z"]}

num_stations = len(station_codes)
num_channels = len(component_codes)
num_samples = len(continuous_seismograms[0].data)

continuous_seismograms_arr = np.zeros((num_stations, num_channels, num_samples), dtype=np.float32)
for s, sta in enumerate(station_codes):
    for c, cp in enumerate(component_codes):
        for cp_alias in component_aliases[cp]:
            sel_seismogram = continuous_seismograms.select(station=sta, component=cp_alias)
            if len(sel_seismogram) > 0:
                continuous_seismograms_arr[s, c, :] = sel_seismogram[0].data
                break
            
continuous_seismograms_arr

Out[21]:

array([[[ 4.09647620e-12,  1.35595988e-11,  1.61010198e-11, ...,
          1.87989971e-10,  5.81934279e-10,  1.59266211e-11],
        [ 8.85055397e-12,  1.41080064e-11,  2.29350306e-12, ...,
         -1.22290775e-10,  4.84999090e-11,  1.02444719e-10],
        [-1.44921930e-11, -8.83197541e-13,  5.94794233e-13, ...,
          3.61067509e-10,  1.61199512e-10, -2.93843921e-10]],

       [[-2.02754636e-11,  7.27350949e-12,  2.90683051e-11, ...,
         -1.27593561e-10,  4.12396610e-11,  2.02138389e-10],
        [-8.89091231e-12,  1.46336155e-11,  4.67565708e-11, ...,
         -1.25822366e-10,  5.78710552e-11,  4.38750390e-11],
        [ 5.92738358e-11,  3.67502244e-11, -1.45799206e-10, ...,
         -4.12412986e-11,  2.63505495e-10, -1.57162269e-11]],

       [[ 5.47254013e-11,  7.05789246e-11, -6.25662326e-12, ...,
         -3.65891401e-10, -7.81559206e-10, -6.38798348e-10],
        [-1.39255118e-11,  2.02905037e-11,  1.07095530e-11, ...,
          1.02497244e-09,  7.77831466e-10, -4.58769023e-10],
        [ 1.94260233e-10, -3.34868328e-10, -1.75106818e-10, ...,
          9.01718644e-10, -9.07585604e-11, -1.75413142e-10]],

       ...,

       [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [-2.38018650e-11,  4.41909790e-12, -4.36377739e-11, ...,
         -4.77496098e-11, -1.39665293e-10, -1.53408883e-10]],

       [[ 4.53147459e-12,  2.15497811e-11,  4.97441671e-12, ...,
          1.37320554e-11, -8.23028538e-12,  2.90706681e-10],
        [-2.32118665e-11, -7.40775150e-12,  4.76967381e-12, ...,
         -5.22246857e-10, -1.66053525e-11,  4.45412207e-10],
        [-6.16842992e-12,  4.56698828e-12,  1.07985522e-11, ...,
          1.94127353e-10,  1.03605340e-10, -3.92332278e-10]],

       [[ 1.00288475e-11, -1.25283534e-11, -8.96174020e-12, ...,
          8.97525543e-12, -6.86870838e-11, -2.98185018e-11],
        [ 1.08289614e-12,  8.16112889e-12, -1.36335960e-11, ...,
          4.46155092e-11,  7.22212498e-11, -3.26238862e-11],
        [-7.30912639e-12,  1.16460912e-11,  5.44300976e-12, ...,
          2.24157377e-11,  3.17579747e-11, -6.67901151e-11]]],
      dtype=float32)

Build template¶

------------------ Background ------------------¶

A template is a collection of waveforms at different channels, $T_{s,c}(t)$, which are clips taken from the continuous seismograms, $u_{s,c}$. These clips are taken at times defined by: $$ u_{s,c}(t)\ |\ t \in \lbrace \tau_{s,c}; \tau_{s,c} + D \rbrace, $$ where $\tau_{s,c}$ is the start time of the template window and $D$ is the template duration.

$\tau_{s,c}$ is given by some prior information on the event: picks or modeled arrival times. The moveouts, $\tilde{\tau}_{s,c}$, are the collection of delay times relative to the earliest $\tau_{s,c}$: $$ \tilde{\tau}_{s,c} = \tau_{s,c} - \underset{s,c}{\min} \lbrace \tau_{s,c} \rbrace .$$

------------------ On the necessity to clip template waveforms out of `numpy.ndarray` instead of `obspy.Stream` ------------------¶

Looking carefully at the output of print(continuous_seismograms.__str__(extended=True)), a few cells before, we see that start times are generally not exactly at midnight. This is a consequence of the discrete nature of the continuous seismograms (here, sampled at 25 samples per second). Thus, in general, the $\tau_{s,c}$ computed from picks or modeled arrival times fall in between two samples of the seismograms.

When running a matched-filter search, we need to make sure the moveouts, $\tilde{\tau}_{s,c}$, ultimately expressed in samples, match exactly the times that were used when clipping the template waveforms out of $u_{s,c}$. One way to ensure this is to first cast the $\tau_{s,c}$ to times in samples and then operate exclusively on the numpy.ndarray: continuous_seismograms_arr:

$$ T_{s,c}[t_n] = u_{s,c}[\tau_{s,c} + n \Delta t],$$ where $\Delta t$ is the sampling time.

Clip out waveforms and moveout and station-weight arrays¶

In [22]:

Copied!





# PHASE_ON_COMP: dictionary defining which moveout we use to extract the waveform.
#                Here, we use windows centered around the S wave for horizontal components
#                and windows starting 1sec before the P wave for the vertical component.
PHASE_ON_COMP = {"N": "S", "E": "S", "Z": "P"}
# OFFSET_PHASE_SEC: dictionary defining the time offset taken before a given phase
#               for example OFFSET_PHASE_SEC["P"] = 1.0 means that we extract the window
#               1 second before the predicted P arrival time
OFFSET_PHASE_SEC = {"P": 1.0, "S": 4.0}
# TEMPLATE_DURATION_SEC
TEMPLATE_DURATION_SEC = 8. 
# SAMPLING_RATE_HZ
SAMPLING_RATE_HZ = 25.
# TEMPLATE_DURATION_SAMP
TEMPLATE_DURATION_SAMP = int(TEMPLATE_DURATION_SEC * SAMPLING_RATE_HZ)
# PHASE_ON_COMP: dictionary defining which moveout we use to extract the waveform.
#                Here, we use windows centered around the S wave for horizontal components
#                and windows starting 1sec before the P wave for the vertical component.
PHASE_ON_COMP = {"N": "S", "E": "S", "Z": "P"}
# OFFSET_PHASE_SEC: dictionary defining the time offset taken before a given phase
#               for example OFFSET_PHASE_SEC["P"] = 1.0 means that we extract the window
#               1 second before the predicted P arrival time
OFFSET_PHASE_SEC = {"P": 1.0, "S": 4.0}
# TEMPLATE_DURATION_SEC
TEMPLATE_DURATION_SEC = 8. 
# SAMPLING_RATE_HZ
SAMPLING_RATE_HZ = 25.
# TEMPLATE_DURATION_SAMP
TEMPLATE_DURATION_SAMP = int(TEMPLATE_DURATION_SEC * SAMPLING_RATE_HZ)

In [23]:

Copied!





# add station_code columns to `selected_event_picks`
selected_event_picks.set_index("station_id", inplace=True)
for staid in selected_event_picks.index:
    station_code = staid.split(".")[1]
    selected_event_picks.loc[staid, "station_code"] = station_code
selected_event_picks
# add station_code columns to `selected_event_picks`
selected_event_picks.set_index("station_id", inplace=True)
for staid in selected_event_picks.index:
    station_code = staid.split(".")[1]
    selected_event_picks.loc[staid, "station_code"] = station_code
selected_event_picks

/tmp/ipykernel_471469/1881751651.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  selected_event_picks.loc[staid, "station_code"] = station_code

Out[23]:

	phase_index	phase_time	phase_score	phase_type	dt_s	phase_polarity	phase_amplitude	event_index	sp_ratio	event_amplitude	adloc_mask	adloc_residual_time	adloc_residual_amplitude	station_code
station_id
CI.CLC..HN	6137848	2019-07-04 17:02:58.488	0.969	P	0.01	-0.879	-0.511873	1101	1.017569	2.474507e-07	1	-0.010048	-0.054426	CLC
CI.CLC..BH	6137847	2019-07-04 17:02:58.489	0.633	P	0.01	-0.324	-0.666956	1101	1.014822	2.022085e-07	1	0.048153	-0.127168	CLC
CI.CLC..HH	6137849	2019-07-04 17:02:58.498	0.965	P	0.01	-0.883	-0.586030	1101	1.086886	2.141392e-07	1	-0.007135	-0.236796	CLC
CI.SRT..HN	6137990	2019-07-04 17:02:59.908	0.879	P	0.01	0.699	-0.643401	1101	1.377537	4.387864e-07	1	-0.069463	0.065039	SRT
CI.SRT..HH	6137990	2019-07-04 17:02:59.908	0.891	P	0.01	0.734	-0.601019	1101	0.729745	5.910421e-07	1	-0.078144	0.020251	SRT
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
CI.WMF..HH	6139185	2019-07-04 17:03:11.858	0.672	S	0.01	0.021	-1.368759	1101	1.096820	3.073536e-07	1	0.140179	-0.329117	WMF
CI.WMF..HN	6139189	2019-07-04 17:03:11.898	0.633	S	0.01	0.026	-1.371407	1101	0.989939	2.891483e-07	1	0.159858	-0.171350	WMF
CI.DTP..BH	6139200	2019-07-04 17:03:12.019	0.734	S	0.01	0.035	-1.322484	1101	1.385395	1.624267e-07	1	0.068940	0.175198	DTP
CI.DTP..HH	6139203	2019-07-04 17:03:12.038	0.707	S	0.01	-0.015	-1.306977	1101	1.122809	3.205840e-07	1	0.052824	0.125587	DTP
CI.DTP..HN	6139211	2019-07-04 17:03:12.118	0.625	S	0.01	0.043	-1.282829	1101	0.863380	2.702479e-07	1	0.105026	0.211253	DTP

90 rows × 14 columns

In the following cell, we build the numpy.ndarray of moveouts $\tilde{\tau}_{s,c}$, expressed in units of samples.

In [24]:

Copied!





# first, we extract the set of relative delay times of the beginning of each
# template window on a given station and component
tau_s_c_sec = np.zeros((num_stations, num_channels), dtype=np.float64)
for s, sta in enumerate(station_codes):
    for c, cp in enumerate(component_codes):
        phase_type = PHASE_ON_COMP[cp]
        picks_s_c = selected_event_picks[
            (
                (selected_event_picks["station_code"] == sta)
                & (selected_event_picks["phase_type"] == phase_type)
            )
        ]
        if len(picks_s_c) == 0:
            # no pick for this station/component: set to -999
            tau_s_c_sec[s, c] = -999
        elif len(picks_s_c) == 1:
            # express pick relative to beginning of day (midnight)
            _pick = pd.Timestamp(picks_s_c["phase_time"])
            _relative_pick_sec = (_pick - pd.Timestamp(_pick.strftime("%Y-%m-%d"))).total_seconds()
            tau_s_c_sec[s, c] = _relative_pick_sec - OFFSET_PHASE_SEC[phase_type]
        else:
            # there were several picks from different channels: average them
            _relative_pick_sec = 0.
            for _pick in picks_s_c["phase_time"].values:
                _pick = pd.Timestamp(_pick)
                _relative_pick_sec += (_pick - pd.Timestamp(_pick.strftime("%Y-%m-%d"))).total_seconds()
            _relative_pick_sec /= float(len(picks_s_c["phase_time"]))
            tau_s_c_sec[s, c] = _relative_pick_sec - OFFSET_PHASE_SEC[phase_type]
# now, we convert these relative times into samples 
# and express them relative to the earliest time
# we also store in memory the minimum time offset `tau_min_samp` for the next step
moveouts_samp_arr = (tau_s_c_sec * SAMPLING_RATE_HZ).astype(np.int64)
tau_min_samp = np.min(moveouts_samp_arr[moveouts_samp_arr > 0])
moveouts_samp_arr = moveouts_samp_arr - tau_min_samp
moveouts_samp_arr
# first, we extract the set of relative delay times of the beginning of each
# template window on a given station and component
tau_s_c_sec = np.zeros((num_stations, num_channels), dtype=np.float64)
for s, sta in enumerate(station_codes):
    for c, cp in enumerate(component_codes):
        phase_type = PHASE_ON_COMP[cp]
        picks_s_c = selected_event_picks[
            (
                (selected_event_picks["station_code"] == sta)
                & (selected_event_picks["phase_type"] == phase_type)
            )
        ]
        if len(picks_s_c) == 0:
            # no pick for this station/component: set to -999
            tau_s_c_sec[s, c] = -999
        elif len(picks_s_c) == 1:
            # express pick relative to beginning of day (midnight)
            _pick = pd.Timestamp(picks_s_c["phase_time"])
            _relative_pick_sec = (_pick - pd.Timestamp(_pick.strftime("%Y-%m-%d"))).total_seconds()
            tau_s_c_sec[s, c] = _relative_pick_sec - OFFSET_PHASE_SEC[phase_type]
        else:
            # there were several picks from different channels: average them
            _relative_pick_sec = 0.
            for _pick in picks_s_c["phase_time"].values:
                _pick = pd.Timestamp(_pick)
                _relative_pick_sec += (_pick - pd.Timestamp(_pick.strftime("%Y-%m-%d"))).total_seconds()
            _relative_pick_sec /= float(len(picks_s_c["phase_time"]))
            tau_s_c_sec[s, c] = _relative_pick_sec - OFFSET_PHASE_SEC[phase_type]
# now, we convert these relative times into samples 
# and express them relative to the earliest time
# we also store in memory the minimum time offset `tau_min_samp` for the next step
moveouts_samp_arr = (tau_s_c_sec * SAMPLING_RATE_HZ).astype(np.int64)
tau_min_samp = np.min(moveouts_samp_arr[moveouts_samp_arr > 0])
moveouts_samp_arr = moveouts_samp_arr - tau_min_samp
moveouts_samp_arr

Out[24]:

array([[-1559399, -1559399, -1559399],
       [-1559399, -1559399, -1559399],
       [     156,      156,      104],
       [      78,       78,       53],
       [-1559399, -1559399, -1559399],
       [     225,      225,      140],
       [-1559399, -1559399, -1559399],
       [     101,      101,       68],
       [     272,      272,      171],
       [     277,      277,      174],
       [      91,       91,       62],
       [     179,      179,      116],
       [     160,      160,      105],
       [       0,        0,       13],
       [     175,      175,      115],
       [     102,      102,       64],
       [      94,       94,       66],
       [      72,       72,       48],
       [     165,      165,      105],
       [     159,      159,      102],
       [-1559399, -1559399, -1559399]])

Next, we use the moveouts, in samples, to clip out the relevant template waveforms from the continuous seismograms.

In [25]:

Copied!





template_waveforms_arr = np.zeros((num_stations, num_channels, TEMPLATE_DURATION_SAMP), dtype=np.float32)
weights_arr = np.ones((num_stations, num_channels), dtype=np.float32)

for s, sta in enumerate(station_codes):
    for c, cp in enumerate(component_codes):
        if moveouts_samp_arr[s, c] < 0:
            # no picks were found on this station
            weights_arr[s, c] = 0.
            continue
        starttime = tau_min_samp + moveouts_samp_arr[s, c]
        endtime = starttime + TEMPLATE_DURATION_SAMP
        template_waveforms_arr[s, c, :] = continuous_seismograms_arr[s, c, starttime:endtime]
        if template_waveforms_arr[s, c, :].sum() == 0.:
            # no data was available on this channel
            weights_arr[s, c] = 0.
        
template_waveforms_arr
template_waveforms_arr = np.zeros((num_stations, num_channels, TEMPLATE_DURATION_SAMP), dtype=np.float32)
weights_arr = np.ones((num_stations, num_channels), dtype=np.float32)

for s, sta in enumerate(station_codes):
    for c, cp in enumerate(component_codes):
        if moveouts_samp_arr[s, c] < 0:
            # no picks were found on this station
            weights_arr[s, c] = 0.
            continue
        starttime = tau_min_samp + moveouts_samp_arr[s, c]
        endtime = starttime + TEMPLATE_DURATION_SAMP
        template_waveforms_arr[s, c, :] = continuous_seismograms_arr[s, c, starttime:endtime]
        if template_waveforms_arr[s, c, :].sum() == 0.:
            # no data was available on this channel
            weights_arr[s, c] = 0.
        
template_waveforms_arr

Out[25]:

array([[[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00]],

       [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00]],

       [[-1.15102615e-04, -2.08503159e-04, -8.57295672e-05, ...,
         -3.02342505e-05,  1.90505059e-04,  2.32697013e-04],
        [-6.95939161e-06, -1.14591476e-05, -4.32817069e-05, ...,
          1.95618981e-04, -6.08503397e-05, -1.64061450e-04],
        [-3.15401110e-08,  6.19310697e-07,  1.24642884e-06, ...,
         -6.75416959e-05,  9.95227219e-06,  9.39235106e-05]],

       ...,

       [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 6.11326456e-08,  1.84645728e-07,  2.68150814e-07, ...,
         -1.40672455e-05,  9.12318046e-06,  1.10272435e-06]],

       [[ 9.79182732e-05,  1.20548120e-05, -6.78718134e-05, ...,
         -4.39741780e-05, -8.56743718e-05, -4.07844163e-05],
        [-1.06544954e-04, -2.09494847e-05,  3.22320338e-05, ...,
         -2.05596989e-05, -2.35213724e-06,  1.43820762e-05],
        [ 7.50264846e-08, -1.39887931e-07, -4.09550466e-07, ...,
          2.37395070e-05, -6.56054617e-05,  1.63327525e-06]],

       [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00]]],
      dtype=float32)

In [64]:

Copied!





fig, axes = plt.subplots(num="template_waveforms", nrows=num_stations, ncols=num_channels, figsize=(16, 20), sharex=True)
for s, sta in enumerate(station_codes):
    for c, cp in enumerate(component_codes):
        time_sec = (max(0., moveouts_samp_arr[s, c]) + np.arange(TEMPLATE_DURATION_SAMP)) / SAMPLING_RATE_HZ
        axes[s, c].plot(
            time_sec,
            template_waveforms_arr[s, c, :],
            color="k",
            lw=0.75
        )
        axes[s, c].text(0.02, 0.96, sta + " " + cp, ha="left", va="top", transform=axes[s, c].transAxes)
        # axes[s, c].set_title(sta + " " + cp)
        if s == num_stations - 1:
            axes[s, c].set_xlabel("Time (s)")
        if c == 0:
            axes[s, c].set_ylabel("Amp.")
        axes[s, c].grid()
# plt.tight_layout()
fig, axes = plt.subplots(num="template_waveforms", nrows=num_stations, ncols=num_channels, figsize=(16, 20), sharex=True)
for s, sta in enumerate(station_codes):
    for c, cp in enumerate(component_codes):
        time_sec = (max(0., moveouts_samp_arr[s, c]) + np.arange(TEMPLATE_DURATION_SAMP)) / SAMPLING_RATE_HZ
        axes[s, c].plot(
            time_sec,
            template_waveforms_arr[s, c, :],
            color="k",
            lw=0.75
        )
        axes[s, c].text(0.02, 0.96, sta + " " + cp, ha="left", va="top", transform=axes[s, c].transAxes)
        # axes[s, c].set_title(sta + " " + cp)
        if s == num_stations - 1:
            axes[s, c].set_xlabel("Time (s)")
        if c == 0:
            axes[s, c].set_ylabel("Amp.")
        axes[s, c].grid()
# plt.tight_layout()

In [ ]:

Copied!





# normalize template waveforms for numerical reasons
norm = np.std(template_waveforms_arr, axis=-1, keepdims=True)
norm[norm == 0.] = 1. 
template_waveforms_arr /= norm

# normalize weights so that they sum up to one
weights_arr /= np.sum(weights_arr)

# normalize continuous seismograms for numerical reasons
norm = np.std(continuous_seismograms_arr, axis=-1, keepdims=True)
norm[norm == 0.] = 1. 
continuous_seismograms_arr /= norm
# normalize template waveforms for numerical reasons
norm = np.std(template_waveforms_arr, axis=-1, keepdims=True)
norm[norm == 0.] = 1. 
template_waveforms_arr /= norm

# normalize weights so that they sum up to one
weights_arr /= np.sum(weights_arr)

# normalize continuous seismograms for numerical reasons
norm = np.std(continuous_seismograms_arr, axis=-1, keepdims=True)
norm[norm == 0.] = 1. 
continuous_seismograms_arr /= norm

Run FMF¶

After all this data formatting, we can now run template matching (also called matched-filtering) to detect new events that are similar to our template event.

For that, use the software Fast Matched Filter (FMF): https://github.com/beridel/fast_matched_filter

FMF offers C and CUDA-C routines to efficiently run template matching on CPUs, or even on GPUs if available to you.

In [27]:

Copied!





# FMF_STEP_SAMP: this is the step between two consecutive calculation of the correlation coefficient
FMF_STEP_SAMP = 1
# ARCH: it determines whether you want to use GPUs or CPUs 
#       If you do not have an Nvidia GPU, set ARCH = "cpu"
ARCH = "gpu"
# FMF_STEP_SAMP: this is the step between two consecutive calculation of the correlation coefficient
FMF_STEP_SAMP = 1
# ARCH: it determines whether you want to use GPUs or CPUs 
#       If you do not have an Nvidia GPU, set ARCH = "cpu"
ARCH = "gpu"

The following cell computes the time series of correlation coefficients between the template waveforms, $T_{s,c}$, and the continuous seismograms, $u_{s,c}$: $$ CC(t) = \sum_{s,c} w_{s,c} \sum_{i=1}^N \dfrac{T^*_{s,c}(n \Delta t) u^*_{s,c}(t + \tilde{\tau}_{s,c} + n \Delta t)}{\sqrt{\sum_{i=1}^N {T^*_{s,c}}^2(n \Delta t) \sum_{i=1}^N {u^*_{s,c}}^2(t + \tilde{\tau}_{s,c} + n \Delta t)}}, $$ with:

$T^*_{s,c} = T_{s,c} - \frac{1}{N} \sum_{i=1}^N T_{s,c}(n \Delta t)$,
$u^*_{s,c}(t) = u_{s,c}(t) - \frac{1}{N} \sum_{i=1}^N u_{s,c}(t + n \Delta t)$.

Note that because the seismograms were filtered below periods that are shorter than the template window ($N \Delta t$) we have $T^*_{s,c} \approx T_{s,c}$ and $u^*_{s,c} \approx u_{s,c}$, which spares us the computation of the mean in each sliding window.

In [28]:

Copied!





cc = fmf.matched_filter(
    template_waveforms_arr.astype(np.float32),
    moveouts_samp_arr.astype(np.int32),
    weights_arr.astype(np.float32),
    continuous_seismograms_arr.astype(np.float32),
    FMF_STEP_SAMP,
    arch=ARCH,
)
cc = fmf.matched_filter(
    template_waveforms_arr.astype(np.float32),
    moveouts_samp_arr.astype(np.int32),
    weights_arr.astype(np.float32),
    continuous_seismograms_arr.astype(np.float32),
    FMF_STEP_SAMP,
    arch=ARCH,
)

In [29]:

Copied!

# FMF is programmed to handle multiple templates at once. Here, we only used
# a single template, hence the size of the outermost axis of "1"
cc.shape
# FMF is programmed to handle multiple templates at once. Here, we only used
# a single template, hence the size of the outermost axis of "1"
cc.shape

Out[29]:

(1, 2159801)

In [37]:

Copied!





# let's print the output of our template matching run, which a time series of network-averaged correlation coefficients
# of same duration as the continuous seismograms
_cc = cc[0, :]
time_cc = np.arange(len(_cc)) / SAMPLING_RATE_HZ

fig = plt.figure("network_averaged_cc", figsize=(20, 6))
gs = fig.add_gridspec(ncols=4)

ax1 = fig.add_subplot(gs[:3])
ax1.plot(time_cc, _cc, lw=0.75)
ax1.set_xlabel("Elapsed time (sec)")
ax1.set_ylabel("Network-averaged cc")
ax1.set_xlim(time_cc.min(), time_cc.max())
ax1.set_title("Time series of template/continuous seismograms similarity")

ax2 = fig.add_subplot(gs[3], sharey=ax1)
_ = ax2.hist(_cc, orientation="horizontal", bins=250, density=True)
ax2.set_xlabel("Normalized count")
ax2.set_title("Histogram")

for ax in [ax1, ax2]:
    ax.grid()
# let's print the output of our template matching run, which a time series of network-averaged correlation coefficients
# of same duration as the continuous seismograms
_cc = cc[0, :]
time_cc = np.arange(len(_cc)) / SAMPLING_RATE_HZ

fig = plt.figure("network_averaged_cc", figsize=(20, 6))
gs = fig.add_gridspec(ncols=4)

ax1 = fig.add_subplot(gs[:3])
ax1.plot(time_cc, _cc, lw=0.75)
ax1.set_xlabel("Elapsed time (sec)")
ax1.set_ylabel("Network-averaged cc")
ax1.set_xlim(time_cc.min(), time_cc.max())
ax1.set_title("Time series of template/continuous seismograms similarity")

ax2 = fig.add_subplot(gs[3], sharey=ax1)
_ = ax2.hist(_cc, orientation="horizontal", bins=250, density=True)
ax2.set_xlabel("Normalized count")
ax2.set_title("Histogram")

for ax in [ax1, ax2]:
    ax.grid()

Set detection threshold and find events¶

We will use the time series of correlation coefficients to build an earthquake catalog. For that, we need to set a detection threshold and define all times above that threshold as triggers caused by near-repeats of the template event.

In [32]:

Copied!





def select_cc_indexes(
    cc_t,
    threshold,
    search_win,
):
    """Select the peaks in the CC time series.

    Parameters
    ------------
    cc_t: (n_corr,) numpy.ndarray
        The CC time series for one template.
    threshold: (n_corr,) numpy.ndarray or scalar
        The detection threshold.
    search_win: scalar int
        The minimum inter-event time, in units of correlation step.


    Returns
    --------
    cc_idx: (n_detections,) numpy.ndarray
        The list of all selected CC indexes. They give the timings of the
        detected events.
    """

    cc_detections = cc_t > threshold
    cc_idx = np.where(cc_detections)[0]

    cc_idx = list(cc_idx)
    n_rm = 0
    for i in range(1, len(cc_idx)):
        if (cc_idx[i - n_rm] - cc_idx[i - n_rm - 1]) < search_win:
            if cc_t[cc_idx[i - n_rm]] > cc_t[cc_idx[i - n_rm - 1]]:
                # keep (i-n_rm)-th detection
                cc_idx.remove(cc_idx[i - n_rm - 1])
            else:
                # keep (i-n_rm-1)-th detection
                cc_idx.remove(cc_idx[i - n_rm])
            n_rm += 1
    cc_idx = np.asarray(cc_idx)
    return cc_idx
    
def select_cc_indexes(
    cc_t,
    threshold,
    search_win,
):
    """Select the peaks in the CC time series.

    Parameters
    ------------
    cc_t: (n_corr,) numpy.ndarray
        The CC time series for one template.
    threshold: (n_corr,) numpy.ndarray or scalar
        The detection threshold.
    search_win: scalar int
        The minimum inter-event time, in units of correlation step.


    Returns
    --------
    cc_idx: (n_detections,) numpy.ndarray
        The list of all selected CC indexes. They give the timings of the
        detected events.
    """

    cc_detections = cc_t > threshold
    cc_idx = np.where(cc_detections)[0]

    cc_idx = list(cc_idx)
    n_rm = 0
    for i in range(1, len(cc_idx)):
        if (cc_idx[i - n_rm] - cc_idx[i - n_rm - 1]) < search_win:
            if cc_t[cc_idx[i - n_rm]] > cc_t[cc_idx[i - n_rm - 1]]:
                # keep (i-n_rm)-th detection
                cc_idx.remove(cc_idx[i - n_rm - 1])
            else:
                # keep (i-n_rm-1)-th detection
                cc_idx.remove(cc_idx[i - n_rm])
            n_rm += 1
    cc_idx = np.asarray(cc_idx)
    return cc_idx
    

In [35]:

Copied!





# INTEREVENT_TIME_RESOLUTION_SEC: In some cases, a template might trigger multiple, closely spaced detections because
#                                 of a phenomenon similar to that of "cycle skipping", where the waveform correlates
#                                 well with a time-shifted version of itself. Thus, to avoid redundant detections, we
#                                 set a minimum time separation between triggers (rule of thumb: about half the template duration)
INTEREVENT_TIME_RESOLUTION_SEC = 5.
INTEREVENT_TIME_RESOLUTION_SAMP = int(INTEREVENT_TIME_RESOLUTION_SEC * SAMPLING_RATE_HZ)
_cc = cc[0, :]
time_cc = np.arange(len(_cc)) * FMF_STEP_SAMP / SAMPLING_RATE_HZ
NUM_RMS = 8.
detection_threshold = NUM_RMS * np.std(_cc)
event_cc_indexes = select_cc_indexes(_cc, detection_threshold, INTEREVENT_TIME_RESOLUTION_SAMP)

fig = plt.figure("network_averaged_cc_w_threshold", figsize=(20, 6))
gs = fig.add_gridspec(ncols=4)

ax1 = fig.add_subplot(gs[:3])
ax1.plot(time_cc, _cc, lw=0.75)
ax1.scatter(time_cc[event_cc_indexes], _cc[event_cc_indexes], linewidths=0.25, edgecolor="k", color="r", zorder=2)
ax1.set_xlabel("Elapsed time (sec)")
ax1.set_ylabel("Network-averaged cc")
ax1.set_xlim(time_cc.min(), time_cc.max())
ax1.set_title("Time series of template/continuous seismograms similarity")

ax2 = fig.add_subplot(gs[3], sharey=ax1)
_ = ax2.hist(_cc, orientation="horizontal", bins=250, density=True, zorder=2)
ax2.set_xlabel("Normalized count")
ax2.set_title("Histogram")

label = f"Detection threshold: {NUM_RMS:.0f}"r"$\times \mathrm{RMS}(\mathrm{CC}(t))$"f"\n{len(event_cc_indexes):d} detected events"
for ax in [ax1, ax2]:
    ax.grid()
    ax.axhline(
        detection_threshold, ls="--", color="r",
        label=label
        )
ax1.legend(loc="upper left")
# INTEREVENT_TIME_RESOLUTION_SEC: In some cases, a template might trigger multiple, closely spaced detections because
#                                 of a phenomenon similar to that of "cycle skipping", where the waveform correlates
#                                 well with a time-shifted version of itself. Thus, to avoid redundant detections, we
#                                 set a minimum time separation between triggers (rule of thumb: about half the template duration)
INTEREVENT_TIME_RESOLUTION_SEC = 5.
INTEREVENT_TIME_RESOLUTION_SAMP = int(INTEREVENT_TIME_RESOLUTION_SEC * SAMPLING_RATE_HZ)
_cc = cc[0, :]
time_cc = np.arange(len(_cc)) * FMF_STEP_SAMP / SAMPLING_RATE_HZ
NUM_RMS = 8.
detection_threshold = NUM_RMS * np.std(_cc)
event_cc_indexes = select_cc_indexes(_cc, detection_threshold, INTEREVENT_TIME_RESOLUTION_SAMP)

fig = plt.figure("network_averaged_cc_w_threshold", figsize=(20, 6))
gs = fig.add_gridspec(ncols=4)

ax1 = fig.add_subplot(gs[:3])
ax1.plot(time_cc, _cc, lw=0.75)
ax1.scatter(time_cc[event_cc_indexes], _cc[event_cc_indexes], linewidths=0.25, edgecolor="k", color="r", zorder=2)
ax1.set_xlabel("Elapsed time (sec)")
ax1.set_ylabel("Network-averaged cc")
ax1.set_xlim(time_cc.min(), time_cc.max())
ax1.set_title("Time series of template/continuous seismograms similarity")

ax2 = fig.add_subplot(gs[3], sharey=ax1)
_ = ax2.hist(_cc, orientation="horizontal", bins=250, density=True, zorder=2)
ax2.set_xlabel("Normalized count")
ax2.set_title("Histogram")

label = f"Detection threshold: {NUM_RMS:.0f}"r"$\times \mathrm{RMS}(\mathrm{CC}(t))$"f"\n{len(event_cc_indexes):d} detected events"
for ax in [ax1, ax2]:
    ax.grid()
    ax.axhline(
        detection_threshold, ls="--", color="r",
        label=label
        )
ax1.legend(loc="upper left")

Out[35]:

<matplotlib.legend.Legend at 0x7fadd03341c0>

Use the trigger times to build the earthquake catalog and extract event waveforms on a given station/component.

In [39]:

Copied!





STATION_NAME = "CLC"
COMPONENT_NAME = "Z"
date = pd.Timestamp(
    (continuous_seismograms[0].stats.starttime.timestamp + continuous_seismograms[0].stats.endtime.timestamp) / 2.,
    unit="s"
).strftime("%Y-%m-%d")

sta_idx = station_codes.index(STATION_NAME)
cp_idx = component_codes.index(COMPONENT_NAME)

catalog = {
    "detection_time": [],
    "peak_amplitude": [],
    "cc": [],
    "normalized_cc": []
}

detected_event_waveforms = []
for i in range(len(event_cc_indexes)):
    idx_start = event_cc_indexes[i] * FMF_STEP_SAMP + moveouts_samp_arr[sta_idx, cp_idx]
    idx_end = idx_start + TEMPLATE_DURATION_SAMP
    detected_event_waveforms.append(continuous_seismograms_arr[sta_idx, cp_idx, idx_start:idx_end])
    # --------------------------------------
    detection_time = pd.Timestamp(date) + pd.Timedelta(event_cc_indexes[i] * FMF_STEP_SAMP / SAMPLING_RATE_HZ, "s")
    cc = _cc[event_cc_indexes[i]]
    normalized_cc = cc / detection_threshold
    peak_amplitude = np.abs(detected_event_waveforms[-1]).max()
    catalog["detection_time"].append(detection_time)
    catalog["peak_amplitude"].append(peak_amplitude)
    catalog["cc"].append(cc)
    catalog["normalized_cc"].append(normalized_cc)
detected_event_waveforms = np.asarray(detected_event_waveforms)
catalog = pd.DataFrame(catalog)
catalog
STATION_NAME = "CLC"
COMPONENT_NAME = "Z"
date = pd.Timestamp(
    (continuous_seismograms[0].stats.starttime.timestamp + continuous_seismograms[0].stats.endtime.timestamp) / 2.,
    unit="s"
).strftime("%Y-%m-%d")

sta_idx = station_codes.index(STATION_NAME)
cp_idx = component_codes.index(COMPONENT_NAME)

catalog = {
    "detection_time": [],
    "peak_amplitude": [],
    "cc": [],
    "normalized_cc": []
}

detected_event_waveforms = []
for i in range(len(event_cc_indexes)):
    idx_start = event_cc_indexes[i] * FMF_STEP_SAMP + moveouts_samp_arr[sta_idx, cp_idx]
    idx_end = idx_start + TEMPLATE_DURATION_SAMP
    detected_event_waveforms.append(continuous_seismograms_arr[sta_idx, cp_idx, idx_start:idx_end])
    # --------------------------------------
    detection_time = pd.Timestamp(date) + pd.Timedelta(event_cc_indexes[i] * FMF_STEP_SAMP / SAMPLING_RATE_HZ, "s")
    cc = _cc[event_cc_indexes[i]]
    normalized_cc = cc / detection_threshold
    peak_amplitude = np.abs(detected_event_waveforms[-1]).max()
    catalog["detection_time"].append(detection_time)
    catalog["peak_amplitude"].append(peak_amplitude)
    catalog["cc"].append(cc)
    catalog["normalized_cc"].append(normalized_cc)
detected_event_waveforms = np.asarray(detected_event_waveforms)
catalog = pd.DataFrame(catalog)
catalog

Out[39]:

	detection_time	peak_amplitude	cc	normalized_cc
0	2019-07-04 15:42:49.840	0.003044	0.167660	1.426171
1	2019-07-04 16:07:21.880	0.002178	0.162098	1.378858
2	2019-07-04 16:13:44.960	0.035267	0.252016	2.143727
3	2019-07-04 17:02:56.960	16.214684	1.000000	8.506305
4	2019-07-04 17:09:21.680	0.155795	0.339568	2.888470
5	2019-07-04 17:11:41.320	0.006684	0.120572	1.025621
6	2019-07-04 17:12:16.520	0.116193	0.181086	1.540373
7	2019-07-04 17:12:39.680	0.030746	0.150769	1.282485
8	2019-07-04 17:13:28.560	0.001898	0.156879	1.334461
9	2019-07-04 17:13:53.480	0.003748	0.125633	1.068674
10	2019-07-04 17:15:48.640	0.019283	0.254803	2.167428
11	2019-07-04 17:16:20.080	0.002100	0.200094	1.702063
12	2019-07-04 17:17:43.000	0.000707	0.123891	1.053855
13	2019-07-04 17:18:09.040	0.001339	0.118229	1.005688
14	2019-07-04 17:20:56.360	0.000434	0.154932	1.317895
15	2019-07-04 17:27:36.680	0.002425	0.143366	1.219511
16	2019-07-04 17:32:54.160	0.003028	0.174316	1.482784
17	2019-07-04 17:37:28.920	9.444638	0.129424	1.100916
18	2019-07-04 17:39:37.680	3.430590	0.118089	1.004503
19	2019-07-04 18:10:10.960	0.157595	0.127386	1.083580
20	2019-07-04 18:19:09.040	0.277436	0.121896	1.036887
21	2019-07-04 19:03:40.560	0.265189	0.148633	1.264321
22	2019-07-04 20:09:45.160	0.116027	0.146313	1.244580
23	2019-07-04 20:55:00.480	0.023483	0.179276	1.524972
24	2019-07-04 21:20:53.760	0.014506	0.133311	1.133981
25	2019-07-04 21:46:58.000	0.026888	0.132058	1.123324
26	2019-07-04 21:50:55.440	0.042805	0.216717	1.843457
27	2019-07-04 23:38:45.680	0.012385	0.130068	1.106398

Plot some waveforms¶

When building a catalog, it is always necessary to visualize some of the detected event waveforms to get a sense of the ratio of true-to-false detection rate.

In the following, we plot the waveforms of each detected event and we also compare the stack of all the waveforms to the original template waveform. Since all events share similar waveforms, the stack is similar to the template waveform. Moreover, since noise across all these waveforms sums up incoherently, stacking acts as a denoiser which may help you produce a cleaner version of the template waveform, for example on remote stations.

In [40]:

Copied!





fig = plt.figure("detected_event_waveforms", figsize=(10, 15))
gs = fig.add_gridspec(nrows=4)

ax1 = fig.add_subplot(gs[:3])

_time_wav = np.arange(detected_event_waveforms.shape[1]) / SAMPLING_RATE_HZ

stack = np.zeros(detected_event_waveforms.shape[1])
for i in range(detected_event_waveforms.shape[0]):
    norm = np.abs(detected_event_waveforms[i, :]).max()
    if catalog["cc"].iloc[i] > 0.999:
        color = "r"
        template_wav = detected_event_waveforms[i, :] / norm
    else:
        color = "k"
    time_of_day = catalog["detection_time"].iloc[i].strftime("%H:%M:%S")
    ax1.plot(_time_wav, detected_event_waveforms[i, :] / norm + i * 1.5, color=color)
    ax1.text(0.98 * _time_wav.max(), i * 1.5 + 0.1, time_of_day, ha="right", va="bottom")
    stack += detected_event_waveforms[i, :] / norm
stack /= np.abs(stack).max()
ax1.set_xlabel("Time (s)")
ax1.set_xlim(_time_wav.min(), _time_wav.max())
ax1.set_ylabel("Normalized offset amplitude")
ax1.set_title(f"Events detected on {date} and recorded by {STATION_NAME}.{COMPONENT_NAME}")

ax2 = fig.add_subplot(gs[3], sharex=ax1)
ax2.plot(_time_wav, stack, color="blue", label="Stacked waveforms")
ax2.plot(_time_wav, template_wav, color="red", ls="--", label="Template waveform")
ax2.legend(loc="upper left")
ax2.set_xlabel("Time (s)")
ax2.set_ylabel("Normalized amplitude")
fig = plt.figure("detected_event_waveforms", figsize=(10, 15))
gs = fig.add_gridspec(nrows=4)

ax1 = fig.add_subplot(gs[:3])

_time_wav = np.arange(detected_event_waveforms.shape[1]) / SAMPLING_RATE_HZ

stack = np.zeros(detected_event_waveforms.shape[1])
for i in range(detected_event_waveforms.shape[0]):
    norm = np.abs(detected_event_waveforms[i, :]).max()
    if catalog["cc"].iloc[i] > 0.999:
        color = "r"
        template_wav = detected_event_waveforms[i, :] / norm
    else:
        color = "k"
    time_of_day = catalog["detection_time"].iloc[i].strftime("%H:%M:%S")
    ax1.plot(_time_wav, detected_event_waveforms[i, :] / norm + i * 1.5, color=color)
    ax1.text(0.98 * _time_wav.max(), i * 1.5 + 0.1, time_of_day, ha="right", va="bottom")
    stack += detected_event_waveforms[i, :] / norm
stack /= np.abs(stack).max()
ax1.set_xlabel("Time (s)")
ax1.set_xlim(_time_wav.min(), _time_wav.max())
ax1.set_ylabel("Normalized offset amplitude")
ax1.set_title(f"Events detected on {date} and recorded by {STATION_NAME}.{COMPONENT_NAME}")

ax2 = fig.add_subplot(gs[3], sharex=ax1)
ax2.plot(_time_wav, stack, color="blue", label="Stacked waveforms")
ax2.plot(_time_wav, template_wav, color="red", ls="--", label="Template waveform")
ax2.legend(loc="upper left")
ax2.set_xlabel("Time (s)")
ax2.set_ylabel("Normalized amplitude")

Out[40]:

Text(0, 0.5, 'Normalized amplitude')

Plot template return time vs detection time¶

In [42]:

Copied!

catalog["return_time_s"] = catalog["detection_time"].diff().dt.total_seconds()
catalog
catalog["return_time_s"] = catalog["detection_time"].diff().dt.total_seconds()
catalog

Out[42]:

	detection_time	peak_amplitude	cc	normalized_cc	return_time_s
0	2019-07-04 15:42:49.840	0.003044	0.167660	1.426171	NaN
1	2019-07-04 16:07:21.880	0.002178	0.162098	1.378858	1472.04
2	2019-07-04 16:13:44.960	0.035267	0.252016	2.143727	383.08
3	2019-07-04 17:02:56.960	16.214684	1.000000	8.506305	2952.00
4	2019-07-04 17:09:21.680	0.155795	0.339568	2.888470	384.72
5	2019-07-04 17:11:41.320	0.006684	0.120572	1.025621	139.64
6	2019-07-04 17:12:16.520	0.116193	0.181086	1.540373	35.20
7	2019-07-04 17:12:39.680	0.030746	0.150769	1.282485	23.16
8	2019-07-04 17:13:28.560	0.001898	0.156879	1.334461	48.88
9	2019-07-04 17:13:53.480	0.003748	0.125633	1.068674	24.92
10	2019-07-04 17:15:48.640	0.019283	0.254803	2.167428	115.16
11	2019-07-04 17:16:20.080	0.002100	0.200094	1.702063	31.44
12	2019-07-04 17:17:43.000	0.000707	0.123891	1.053855	82.92
13	2019-07-04 17:18:09.040	0.001339	0.118229	1.005688	26.04
14	2019-07-04 17:20:56.360	0.000434	0.154932	1.317895	167.32
15	2019-07-04 17:27:36.680	0.002425	0.143366	1.219511	400.32
16	2019-07-04 17:32:54.160	0.003028	0.174316	1.482784	317.48
17	2019-07-04 17:37:28.920	9.444638	0.129424	1.100916	274.76
18	2019-07-04 17:39:37.680	3.430590	0.118089	1.004503	128.76
19	2019-07-04 18:10:10.960	0.157595	0.127386	1.083580	1833.28
20	2019-07-04 18:19:09.040	0.277436	0.121896	1.036887	538.08
21	2019-07-04 19:03:40.560	0.265189	0.148633	1.264321	2671.52
22	2019-07-04 20:09:45.160	0.116027	0.146313	1.244580	3964.60
23	2019-07-04 20:55:00.480	0.023483	0.179276	1.524972	2715.32
24	2019-07-04 21:20:53.760	0.014506	0.133311	1.133981	1553.28
25	2019-07-04 21:46:58.000	0.026888	0.132058	1.123324	1564.24
26	2019-07-04 21:50:55.440	0.042805	0.216717	1.843457	237.44
27	2019-07-04 23:38:45.680	0.012385	0.130068	1.106398	6470.24

In [43]:

Copied!





fig, ax = plt.subplots(num="return_time_vs_detection_time", figsize=(16, 7))

ax.scatter(catalog["detection_time"], catalog["return_time_s"], c=catalog["cc"], linewidths=0.25, edgecolor="k", cmap="magma", s=50)
ax.set_xlabel("Detection time")
ax.set_ylabel("Return time (s)")
ax.set_title("Return time vs detection time")
ax.grid()
cbar = plt.colorbar(ax.collections[0], ax=ax)
cbar.set_label("CC")
ax.set_yscale("log")
fig, ax = plt.subplots(num="return_time_vs_detection_time", figsize=(16, 7))

ax.scatter(catalog["detection_time"], catalog["return_time_s"], c=catalog["cc"], linewidths=0.25, edgecolor="k", cmap="magma", s=50)
ax.set_xlabel("Detection time")
ax.set_ylabel("Return time (s)")
ax.set_title("Return time vs detection time")
ax.grid()
cbar = plt.colorbar(ax.collections[0], ax=ax)
cbar.set_label("CC")
ax.set_yscale("log")

Relative magnitude¶

Template matching lends itself well to quickly estimate relative event magnitudes, relative to the magnitude of the template event.

$$ M_r = M_{\mathrm{ref}} + \dfrac{1}{N} \sum_{ch=1}^{N} \log \dfrac{A_{ch}}{A_{\mathrm{ref},ch}} $$

where the sum is taken over $N$ channels and $A_{ch}$ is the peak amplitude measured on channel $ch$ while $A_{\mathrm{ref},ch}$ is the peak amplitude of the reference event (the template) measured on the same channel.

In [44]:

Copied!

M_REF = selected_event_meta.magnitude
M_REF = selected_event_meta.magnitude

We can use the peak amplitudes that we read before.

In [45]:

Copied!

template_index = np.where(catalog["cc"] > 0.999)[0][0]

m_rel = M_REF + np.log10(catalog["peak_amplitude"] / catalog["peak_amplitude"].iloc[template_index])

catalog["m_rel_one_channel"] = m_rel
catalog
template_index = np.where(catalog["cc"] > 0.999)[0][0]

m_rel = M_REF + np.log10(catalog["peak_amplitude"] / catalog["peak_amplitude"].iloc[template_index])

catalog["m_rel_one_channel"] = m_rel
catalog

Out[45]:

	detection_time	peak_amplitude	cc	normalized_cc	return_time_s	m_rel_one_channel
0	2019-07-04 15:42:49.840	0.003044	0.167660	1.426171	NaN	0.750315
1	2019-07-04 16:07:21.880	0.002178	0.162098	1.378858	1472.04	0.604785
2	2019-07-04 16:13:44.960	0.035267	0.252016	2.143727	383.08	1.814178
3	2019-07-04 17:02:56.960	16.214684	1.000000	8.506305	2952.00	4.476724
4	2019-07-04 17:09:21.680	0.155795	0.339568	2.888470	384.72	2.459368
5	2019-07-04 17:11:41.320	0.006684	0.120572	1.025621	139.64	1.091826
6	2019-07-04 17:12:16.520	0.116193	0.181086	1.540373	35.20	2.331995
7	2019-07-04 17:12:39.680	0.030746	0.150769	1.282485	23.16	1.754599
8	2019-07-04 17:13:28.560	0.001898	0.156879	1.334461	48.88	0.545083
9	2019-07-04 17:13:53.480	0.003748	0.125633	1.068674	24.92	0.840592
10	2019-07-04 17:15:48.640	0.019283	0.254803	2.167428	115.16	1.551985
11	2019-07-04 17:16:20.080	0.002100	0.200094	1.702063	31.44	0.588959
12	2019-07-04 17:17:43.000	0.000707	0.123891	1.053855	82.92	0.116397
13	2019-07-04 17:18:09.040	0.001339	0.118229	1.005688	26.04	0.393615
14	2019-07-04 17:20:56.360	0.000434	0.154932	1.317895	167.32	-0.096164
15	2019-07-04 17:27:36.680	0.002425	0.143366	1.219511	400.32	0.651551
16	2019-07-04 17:32:54.160	0.003028	0.174316	1.482784	317.48	0.747959
17	2019-07-04 17:37:28.920	9.444638	0.129424	1.100916	274.76	4.242001
18	2019-07-04 17:39:37.680	3.430590	0.118089	1.004503	128.76	3.802184
19	2019-07-04 18:10:10.960	0.157595	0.127386	1.083580	1833.28	2.464358
20	2019-07-04 18:19:09.040	0.277436	0.121896	1.036887	538.08	2.709977
21	2019-07-04 19:03:40.560	0.265189	0.148633	1.264321	2671.52	2.690371
22	2019-07-04 20:09:45.160	0.116027	0.146313	1.244580	3964.60	2.331373
23	2019-07-04 20:55:00.480	0.023483	0.179276	1.524972	2715.32	1.637578
24	2019-07-04 21:20:53.760	0.014506	0.133311	1.133981	1553.28	1.428355
25	2019-07-04 21:46:58.000	0.026888	0.132058	1.123324	1564.24	1.696379
26	2019-07-04 21:50:55.440	0.042805	0.216717	1.843457	237.44	1.898305
27	2019-07-04 23:38:45.680	0.012385	0.130068	1.106398	6470.24	1.359715

However, it is much better to average the log ratios over multiple channels.

In [46]:

Copied!





# first, extract clips from the continuous seismograms
detected_event_waveforms_all_channels = np.zeros(
    (len(catalog), num_stations, num_channels, TEMPLATE_DURATION_SAMP), dtype=np.float32
)
for i in range(len(catalog)):
    for s in range(num_stations):
        for c in range(num_channels):
            idx_start = event_cc_indexes[i] * FMF_STEP_SAMP + moveouts_samp_arr[sta_idx, cp_idx]
            idx_end = idx_start + TEMPLATE_DURATION_SAMP
            detected_event_waveforms_all_channels[i, s, c, :] = (
                continuous_seismograms_arr[s, c, idx_start:idx_end]
                )
# first, extract clips from the continuous seismograms
detected_event_waveforms_all_channels = np.zeros(
    (len(catalog), num_stations, num_channels, TEMPLATE_DURATION_SAMP), dtype=np.float32
)
for i in range(len(catalog)):
    for s in range(num_stations):
        for c in range(num_channels):
            idx_start = event_cc_indexes[i] * FMF_STEP_SAMP + moveouts_samp_arr[sta_idx, cp_idx]
            idx_end = idx_start + TEMPLATE_DURATION_SAMP
            detected_event_waveforms_all_channels[i, s, c, :] = (
                continuous_seismograms_arr[s, c, idx_start:idx_end]
                )

In [47]:

Copied!





# then, measure the peak amplitudes
amplitude_ratios = (
    np.max(detected_event_waveforms_all_channels, axis=-1)
    / np.max(detected_event_waveforms_all_channels[template_index, ...], axis=-1)[None, ...]
)

# if some data were missing, we may have divided by 0 and nans or infs
# mask them and ignore them in the following calculation
invalid = (np.isnan(amplitude_ratios) | np.isinf(amplitude_ratios))
amplitude_ratios = np.ma.masked_where(invalid, amplitude_ratios)

# apply the above formula to get the relative magnitudes
m_rel_all_channels = M_REF + np.ma.mean(np.log10(amplitude_ratios), axis=(1, 2))

catalog["m_rel_all_channels"] = m_rel_all_channels
catalog
# then, measure the peak amplitudes
amplitude_ratios = (
    np.max(detected_event_waveforms_all_channels, axis=-1)
    / np.max(detected_event_waveforms_all_channels[template_index, ...], axis=-1)[None, ...]
)

# if some data were missing, we may have divided by 0 and nans or infs
# mask them and ignore them in the following calculation
invalid = (np.isnan(amplitude_ratios) | np.isinf(amplitude_ratios))
amplitude_ratios = np.ma.masked_where(invalid, amplitude_ratios)

# apply the above formula to get the relative magnitudes
m_rel_all_channels = M_REF + np.ma.mean(np.log10(amplitude_ratios), axis=(1, 2))

catalog["m_rel_all_channels"] = m_rel_all_channels
catalog

/tmp/ipykernel_471469/457280950.py:3: RuntimeWarning: invalid value encountered in true_divide
  np.max(detected_event_waveforms_all_channels, axis=-1)

Out[47]:

	detection_time	peak_amplitude	cc	normalized_cc	return_time_s	m_rel_one_channel	m_rel_all_channels
0	2019-07-04 15:42:49.840	0.003044	0.167660	1.426171	NaN	0.750315	0.948512
1	2019-07-04 16:07:21.880	0.002178	0.162098	1.378858	1472.04	0.604785	0.914793
2	2019-07-04 16:13:44.960	0.035267	0.252016	2.143727	383.08	1.814178	1.781098
3	2019-07-04 17:02:56.960	16.214684	1.000000	8.506305	2952.00	4.476724	4.476724
4	2019-07-04 17:09:21.680	0.155795	0.339568	2.888470	384.72	2.459368	2.556656
5	2019-07-04 17:11:41.320	0.006684	0.120572	1.025621	139.64	1.091826	1.440504
6	2019-07-04 17:12:16.520	0.116193	0.181086	1.540373	35.20	2.331995	2.389888
7	2019-07-04 17:12:39.680	0.030746	0.150769	1.282485	23.16	1.754599	2.043036
8	2019-07-04 17:13:28.560	0.001898	0.156879	1.334461	48.88	0.545083	1.010687
9	2019-07-04 17:13:53.480	0.003748	0.125633	1.068674	24.92	0.840592	1.131839
10	2019-07-04 17:15:48.640	0.019283	0.254803	2.167428	115.16	1.551985	1.764089
11	2019-07-04 17:16:20.080	0.002100	0.200094	1.702063	31.44	0.588959	1.078424
12	2019-07-04 17:17:43.000	0.000707	0.123891	1.053855	82.92	0.116397	0.674097
13	2019-07-04 17:18:09.040	0.001339	0.118229	1.005688	26.04	0.393615	0.755659
14	2019-07-04 17:20:56.360	0.000434	0.154932	1.317895	167.32	-0.096164	0.674497
15	2019-07-04 17:27:36.680	0.002425	0.143366	1.219511	400.32	0.651551	1.081399
16	2019-07-04 17:32:54.160	0.003028	0.174316	1.482784	317.48	0.747959	1.004731
17	2019-07-04 17:37:28.920	9.444638	0.129424	1.100916	274.76	4.242001	4.492626
18	2019-07-04 17:39:37.680	3.430590	0.118089	1.004503	128.76	3.802184	4.082750
19	2019-07-04 18:10:10.960	0.157595	0.127386	1.083580	1833.28	2.464358	2.506171
20	2019-07-04 18:19:09.040	0.277436	0.121896	1.036887	538.08	2.709977	3.136539
21	2019-07-04 19:03:40.560	0.265189	0.148633	1.264321	2671.52	2.690371	2.455167
22	2019-07-04 20:09:45.160	0.116027	0.146313	1.244580	3964.60	2.331373	2.180167
23	2019-07-04 20:55:00.480	0.023483	0.179276	1.524972	2715.32	1.637578	1.975095
24	2019-07-04 21:20:53.760	0.014506	0.133311	1.133981	1553.28	1.428355	1.859795
25	2019-07-04 21:46:58.000	0.026888	0.132058	1.123324	1564.24	1.696379	2.244630
26	2019-07-04 21:50:55.440	0.042805	0.216717	1.843457	237.44	1.898305	2.152389
27	2019-07-04 23:38:45.680	0.012385	0.130068	1.106398	6470.24	1.359715	1.358247

Next, we plot the distribution of earthquake magnitudes. The cumulative distribution usually follows the so-called Gutenberg-Richter law: $$ \log N(m \geq M) = a - b M$$ In this example, the number of events is too low to estimate a meaningful b-value.

In [ ]:

Copied!





fig, ax = plt.subplots(num="magnitude_distribution", figsize=(8, 8))
axb = ax.twinx()
ax.set_title("Distribution of earthquake magnitudes")

_ = ax.hist(catalog["m_rel_all_channels"], bins=15, color="k", alpha=0.5)
axb.plot(np.sort(catalog["m_rel_all_channels"]), np.arange(len(catalog))[::-1], color="k", lw=2)

ax.set_xlabel("Magnitude")
ax.set_ylabel("Event Count")
axb.set_ylabel("Cumulative Event Count")

for ax in [ax, axb]:
    ax.set_yscale("log")
fig, ax = plt.subplots(num="magnitude_distribution", figsize=(8, 8))
axb = ax.twinx()
ax.set_title("Distribution of earthquake magnitudes")

_ = ax.hist(catalog["m_rel_all_channels"], bins=15, color="k", alpha=0.5)
axb.plot(np.sort(catalog["m_rel_all_channels"]), np.arange(len(catalog))[::-1], color="k", lw=2)

ax.set_xlabel("Magnitude")
ax.set_ylabel("Event Count")
axb.set_ylabel("Cumulative Event Count")

for ax in [ax, axb]:
    ax.set_yscale("log")

Template matching with a single template¶

Data requirements¶

Installing fast_matched_filter¶

Load PhaseNet catalog¶

Pick one event¶

Run template matching¶

Read data from same day¶

Build template¶

------------------ Background ------------------¶

------------------ On the necessity to clip template waveforms out of numpy.ndarray instead of obspy.Stream ------------------¶

Clip out waveforms and moveout and station-weight arrays¶

Run FMF¶

Set detection threshold and find events¶

Plot some waveforms¶

Plot template return time vs detection time¶

Relative magnitude¶

Installing `fast_matched_filter`¶

------------------ On the necessity to clip template waveforms out of `numpy.ndarray` instead of `obspy.Stream` ------------------¶