Demonstration of a common issue encountered in template matching¶
Templates are selected from Weiqiang Zhu's PhaseNet catalog.
Data requirements¶
Download the seismic data at: https://doi.org/10.5281/zenodo.15097180
Download the PhaseNet earthquake catalog with the following three commands:
- curl -o adloc_events.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_events.csv
- curl -o adloc_picks.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_picks.csv
- curl -o adloc_stations.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_stations.csv
Installing fast_matched_filter
¶
This example uses fast_matched_filter
, a Python wrapper for C and CUDA-C routines. The C and CUDA-C libraries have to be compiled. Read the instructions at https://ebeauce.github.io/FMF_documentation/introduction.html.
import os
import fast_matched_filter as fmf
import glob
import numpy as np
import matplotlib.pyplot as plt
import obspy as obs
import pandas as pd
# path variables and file names
DIR_WAVEFORMS = "/home/ebeauce/SSA_EQ_DETECTION_WORKSHOP/data" # REPLACE THIS PATH WITH WHEREVER YOU DOWNLOADED THE DATA
DIR_CATALOG = "../picks_phasenet/"
STATION_FILE = "adloc_stations.csv"
EVENT_FILE = "adloc_events.csv"
PICK_FILE = "adloc_picks.csv"
Load PhaseNet catalog¶
Here, we read the catalog of the 2019 Ridgecrest sequence made with PhaseNet. Information is divided into three files:
- a station metadata file,
- an event metadata file (the catalog per se),
- a pick database, which contains all the P- and S-wave picks found by PhaseNet.
station_meta = pd.read_csv(os.path.join(DIR_CATALOG, STATION_FILE))
station_meta
network | station | location | instrument | component | latitude | longitude | elevation_m | depth_km | provider | station_id | station_term_time_p | station_term_time_s | station_term_amplitude | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | CI | CCC | NaN | BH | ENZ | 35.524950 | -117.364530 | 670.0 | -0.6700 | SCEDC | CI.CCC..BH | 0.266086 | 0.500831 | 0.049399 |
1 | CI | CCC | NaN | HH | ENZ | 35.524950 | -117.364530 | 670.0 | -0.6700 | SCEDC | CI.CCC..HH | 0.295428 | 0.518465 | 0.191475 |
2 | CI | CCC | NaN | HN | ENZ | 35.524950 | -117.364530 | 670.0 | -0.6700 | SCEDC | CI.CCC..HN | 0.296263 | 0.541148 | 0.064485 |
3 | CI | CLC | NaN | BH | ENZ | 35.815740 | -117.597510 | 775.0 | -0.7750 | SCEDC | CI.CLC..BH | -0.231963 | -0.415271 | -0.331371 |
4 | CI | CLC | NaN | HH | ENZ | 35.815740 | -117.597510 | 775.0 | -0.7750 | SCEDC | CI.CLC..HH | -0.168743 | -0.390045 | -0.140313 |
5 | CI | CLC | NaN | HN | ENZ | 35.815740 | -117.597510 | 775.0 | -0.7750 | SCEDC | CI.CLC..HN | -0.175671 | -0.388116 | -0.249066 |
6 | CI | DTP | NaN | BH | ENZ | 35.267420 | -117.845810 | 908.0 | -0.9080 | SCEDC | CI.DTP..BH | -0.305881 | -0.602459 | -0.503411 |
7 | CI | DTP | NaN | HH | ENZ | 35.267420 | -117.845810 | 908.0 | -0.9080 | SCEDC | CI.DTP..HH | -0.263705 | -0.564867 | -0.437951 |
8 | CI | DTP | NaN | HN | ENZ | 35.267420 | -117.845810 | 908.0 | -0.9080 | SCEDC | CI.DTP..HN | -0.244383 | -0.538990 | -0.500516 |
9 | CI | JRC2 | NaN | BH | ENZ | 35.982490 | -117.808850 | 1469.0 | -1.4690 | SCEDC | CI.JRC2..BH | 0.011361 | -0.080285 | -0.039941 |
10 | CI | JRC2 | NaN | HH | ENZ | 35.982490 | -117.808850 | 1469.0 | -1.4690 | SCEDC | CI.JRC2..HH | 0.053539 | -0.052748 | 0.068213 |
11 | CI | JRC2 | NaN | HN | ENZ | 35.982490 | -117.808850 | 1469.0 | -1.4690 | SCEDC | CI.JRC2..HN | 0.059764 | -0.045991 | -0.007637 |
12 | CI | LRL | NaN | BH | ENZ | 35.479540 | -117.682120 | 1340.0 | -1.3400 | SCEDC | CI.LRL..BH | -0.295604 | -0.540857 | 0.033788 |
13 | CI | LRL | NaN | HH | ENZ | 35.479540 | -117.682120 | 1340.0 | -1.3400 | SCEDC | CI.LRL..HH | -0.268381 | -0.513955 | 0.146876 |
14 | CI | LRL | NaN | HN | ENZ | 35.479540 | -117.682120 | 1340.0 | -1.3400 | SCEDC | CI.LRL..HN | -0.266329 | -0.503088 | 0.045499 |
15 | CI | LRL | 2C | HN | ENZ | 35.479540 | -117.682120 | 1340.0 | -1.3400 | SCEDC | CI.LRL.2C.HN | 0.000000 | 0.000000 | 0.000000 |
16 | CI | MPM | NaN | BH | ENZ | 36.057990 | -117.489010 | 1839.0 | -1.8390 | SCEDC | CI.MPM..BH | -0.011825 | -0.098089 | -0.518793 |
17 | CI | MPM | NaN | HH | ENZ | 36.057990 | -117.489010 | 1839.0 | -1.8390 | SCEDC | CI.MPM..HH | 0.009095 | -0.081896 | -0.459349 |
18 | CI | MPM | NaN | HN | ENZ | 36.057990 | -117.489010 | 1839.0 | -1.8390 | SCEDC | CI.MPM..HN | 0.011870 | -0.052277 | -0.532764 |
19 | CI | Q0072 | 01 | HN | ENZ | 35.609617 | -117.666721 | 695.0 | -0.6950 | SCEDC | CI.Q0072.01.HN | 0.000000 | 0.000000 | 0.000000 |
20 | CI | SLA | NaN | BH | ENZ | 35.890950 | -117.283320 | 1174.0 | -1.1740 | SCEDC | CI.SLA..BH | 0.066893 | 0.118500 | -0.081634 |
21 | CI | SLA | NaN | HH | ENZ | 35.890950 | -117.283320 | 1174.0 | -1.1740 | SCEDC | CI.SLA..HH | 0.089833 | 0.128589 | -0.042928 |
22 | CI | SLA | NaN | HN | ENZ | 35.890950 | -117.283320 | 1174.0 | -1.1740 | SCEDC | CI.SLA..HN | 0.093526 | 0.168238 | -0.150381 |
23 | CI | SRT | NaN | BH | ENZ | 35.692350 | -117.750510 | 667.0 | -0.6670 | SCEDC | CI.SRT..BH | 0.148171 | 0.653411 | -0.253527 |
24 | CI | SRT | NaN | HH | ENZ | 35.692350 | -117.750510 | 667.0 | -0.6670 | SCEDC | CI.SRT..HH | 0.183868 | 0.641707 | -0.208859 |
25 | CI | SRT | NaN | HN | ENZ | 35.692350 | -117.750510 | 667.0 | -0.6670 | SCEDC | CI.SRT..HN | 0.175475 | 0.674587 | -0.295869 |
26 | CI | TOW2 | NaN | BH | ENZ | 35.808560 | -117.764880 | 685.0 | -0.6850 | SCEDC | CI.TOW2..BH | 0.268083 | 0.816279 | 0.028038 |
27 | CI | TOW2 | NaN | HH | ENZ | 35.808560 | -117.764880 | 685.0 | -0.6850 | SCEDC | CI.TOW2..HH | 0.285970 | 0.831533 | 0.102681 |
28 | CI | TOW2 | NaN | HN | ENZ | 35.808560 | -117.764880 | 685.0 | -0.6850 | SCEDC | CI.TOW2..HN | 0.285113 | 0.843886 | -0.006698 |
29 | CI | WBM | NaN | BH | ENZ | 35.608390 | -117.890490 | 892.0 | -0.8920 | SCEDC | CI.WBM..BH | 0.136927 | 0.128744 | -0.166279 |
30 | CI | WBM | NaN | HH | ENZ | 35.608390 | -117.890490 | 892.0 | -0.8920 | SCEDC | CI.WBM..HH | 0.152299 | 0.124250 | -0.138614 |
31 | CI | WBM | NaN | HN | ENZ | 35.608390 | -117.890490 | 892.0 | -0.8920 | SCEDC | CI.WBM..HN | 0.170719 | 0.179461 | -0.101436 |
32 | CI | WBM | 2C | HN | ENZ | 35.608390 | -117.890490 | 892.0 | -0.8920 | SCEDC | CI.WBM.2C.HN | 0.000000 | 0.000000 | 0.000000 |
33 | CI | WCS2 | NaN | BH | ENZ | 36.025210 | -117.765260 | 1143.0 | -1.1430 | SCEDC | CI.WCS2..BH | 0.030349 | -0.126935 | 0.023642 |
34 | CI | WCS2 | NaN | HH | ENZ | 36.025210 | -117.765260 | 1143.0 | -1.1430 | SCEDC | CI.WCS2..HH | 0.065321 | -0.099447 | 0.146628 |
35 | CI | WCS2 | NaN | HN | ENZ | 36.025210 | -117.765260 | 1143.0 | -1.1430 | SCEDC | CI.WCS2..HN | 0.061651 | -0.096450 | 0.037563 |
36 | CI | WMF | NaN | BH | ENZ | 36.117580 | -117.854860 | 1537.4 | -1.5374 | SCEDC | CI.WMF..BH | 0.039416 | -0.005761 | -0.165183 |
37 | CI | WMF | NaN | HH | ENZ | 36.117580 | -117.854860 | 1537.4 | -1.5374 | SCEDC | CI.WMF..HH | 0.075658 | 0.005327 | -0.079900 |
38 | CI | WMF | NaN | HN | ENZ | 36.117580 | -117.854860 | 1537.4 | -1.5374 | SCEDC | CI.WMF..HN | 0.085427 | 0.025273 | -0.240971 |
39 | CI | WMF | 2C | HN | ENZ | 36.117580 | -117.854860 | 1537.4 | -1.5374 | SCEDC | CI.WMF.2C.HN | 0.000000 | 0.000000 | 0.000000 |
40 | CI | WNM | NaN | EH | Z | 35.842200 | -117.906160 | 974.3 | -0.9743 | SCEDC | CI.WNM..EH | -0.026889 | -0.070269 | -0.332510 |
41 | CI | WNM | NaN | HN | ENZ | 35.842200 | -117.906160 | 974.3 | -0.9743 | SCEDC | CI.WNM..HN | -0.011659 | -0.118223 | -0.038719 |
42 | CI | WNM | 2C | HN | ENZ | 35.842200 | -117.906160 | 974.3 | -0.9743 | SCEDC | CI.WNM.2C.HN | 0.000000 | 0.000000 | 0.000000 |
43 | CI | WRC2 | NaN | BH | ENZ | 35.947900 | -117.650380 | 943.0 | -0.9430 | SCEDC | CI.WRC2..BH | -0.023860 | -0.043523 | 0.117833 |
44 | CI | WRC2 | NaN | HH | ENZ | 35.947900 | -117.650380 | 943.0 | -0.9430 | SCEDC | CI.WRC2..HH | 0.010633 | -0.029488 | 0.165234 |
45 | CI | WRC2 | NaN | HN | ENZ | 35.947900 | -117.650380 | 943.0 | -0.9430 | SCEDC | CI.WRC2..HN | 0.014658 | -0.021367 | 0.103905 |
46 | CI | WRV2 | NaN | EH | Z | 36.007740 | -117.890400 | 1070.0 | -1.0700 | SCEDC | CI.WRV2..EH | -0.003461 | -0.154572 | -0.355199 |
47 | CI | WRV2 | NaN | HN | ENZ | 36.007740 | -117.890400 | 1070.0 | -1.0700 | SCEDC | CI.WRV2..HN | 0.017317 | -0.137916 | -0.273270 |
48 | CI | WRV2 | 2C | HN | ENZ | 36.007740 | -117.890400 | 1070.0 | -1.0700 | SCEDC | CI.WRV2.2C.HN | 0.000000 | 0.000000 | 0.000000 |
49 | CI | WVP2 | NaN | EH | Z | 35.949390 | -117.817690 | 1465.0 | -1.4650 | SCEDC | CI.WVP2..EH | 0.020325 | 0.008989 | -0.341606 |
50 | CI | WVP2 | NaN | HN | ENZ | 35.949390 | -117.817690 | 1465.0 | -1.4650 | SCEDC | CI.WVP2..HN | 0.024742 | -0.103024 | -0.089014 |
51 | CI | WVP2 | 2C | HN | ENZ | 35.949390 | -117.817690 | 1465.0 | -1.4650 | SCEDC | CI.WVP2.2C.HN | 0.000000 | 0.000000 | 0.000000 |
The following shows a very rudimentary map of the station network. Look into the cartopy
package for more sophisticated maps.
_station_meta = station_meta.drop_duplicates("station")
fig, ax = plt.subplots(num="station_network", figsize=(10, 10))
ax.scatter(_station_meta["longitude"], _station_meta["latitude"], marker="v", color="k")
for idx, row in _station_meta.iterrows():
ax.text(row.longitude + 0.01, row.latitude + 0.01, row.station, va="bottom", ha="left")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.grid()
ax.set_title("Stations used to build the PhaseNet catalog")
Text(0.5, 1.0, 'Stations used to build the PhaseNet catalog')
event_meta = pd.read_csv(os.path.join(DIR_CATALOG, EVENT_FILE))
event_meta
time | adloc_score | adloc_residual_time | num_picks | magnitude | adloc_residual_amplitude | event_index | longitude | latitude | depth_km | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 2019-07-04 00:46:47.342963596 | 0.866596 | 0.057355 | 35 | 0.595368 | 0.165480 | 6720 | -117.882570 | 36.091088 | 4.643969 |
1 | 2019-07-04 00:55:32.648412579 | 0.781116 | 0.203567 | 16 | 1.142510 | 0.170509 | 17122 | -117.799226 | 35.378160 | 11.078458 |
2 | 2019-07-04 00:56:37.232733104 | 0.908073 | 0.086183 | 42 | 0.912494 | 0.166681 | 5411 | -117.880902 | 36.091986 | 4.854336 |
3 | 2019-07-04 02:00:39.149363202 | 0.814322 | 0.036164 | 15 | 0.209530 | 0.092534 | 17868 | -117.866468 | 36.093520 | 4.981447 |
4 | 2019-07-04 03:05:31.018885833 | 0.799281 | 0.080708 | 11 | 0.104050 | 0.156833 | 25037 | -117.846320 | 36.100386 | 5.943363 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
20898 | 2019-07-09 23:58:14.298499048 | 0.933013 | 0.097816 | 67 | 1.543257 | 0.131523 | 1907 | -117.711811 | 35.926468 | 6.652020 |
20899 | 2019-07-09 23:58:47.701746285 | 0.882434 | 0.090185 | 73 | 1.087089 | 0.140159 | 2051 | -117.604263 | 35.797450 | 6.617413 |
20900 | 2019-07-09 23:59:05.102247662 | 0.798047 | 0.435077 | 26 | 1.147040 | 0.170994 | 12567 | -117.509729 | 35.692722 | 12.815041 |
20901 | 2019-07-09 23:59:40.257837813 | 0.971081 | 0.065523 | 35 | 1.161323 | 0.068586 | 7726 | -117.846289 | 36.061435 | 5.224666 |
20902 | 2019-07-09 23:59:49.650466544 | 0.800524 | 0.023674 | 15 | 0.936622 | 0.095959 | 18566 | -117.896100 | 36.095886 | 6.761860 |
20903 rows × 10 columns
picks = pd.read_csv(os.path.join(DIR_CATALOG, PICK_FILE))
picks
station_id | phase_index | phase_time | phase_score | phase_type | dt_s | phase_polarity | phase_amplitude | event_index | sp_ratio | event_amplitude | adloc_mask | adloc_residual_time | adloc_residual_amplitude | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | CI.WMF..BH | 280874 | 2019-07-04 00:46:48.759 | 0.594 | P | 0.01 | 0.110 | -3.213107 | 6720 | 0.955091 | 3.124152e-07 | 1 | 0.003919 | 0.012320 |
1 | CI.WMF..HH | 280881 | 2019-07-04 00:46:48.818 | 0.973 | P | 0.01 | 0.938 | -3.077638 | 6720 | 0.948896 | 3.368390e-07 | 1 | 0.026491 | 0.063669 |
2 | CI.WMF..HN | 280881 | 2019-07-04 00:46:48.818 | 0.973 | P | 0.01 | 0.898 | -3.128019 | 6720 | 1.598008 | 2.146277e-07 | 1 | 0.017580 | 0.173702 |
3 | CI.WRV2..EH | 280945 | 2019-07-04 00:46:49.450 | 0.977 | P | 0.01 | -0.855 | -3.464453 | 6720 | 1.000000 | 3.298200e-07 | 1 | 0.077194 | 0.244381 |
4 | CI.WRV2..HN | 280945 | 2019-07-04 00:46:49.450 | 0.941 | P | 0.01 | -0.906 | -3.585863 | 6720 | 1.147235 | 3.908305e-07 | 1 | 0.056694 | 0.041691 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
955700 | CI.WRV2..HN | 8639355 | 2019-07-09 23:59:53.550 | 0.688 | S | 0.01 | -0.022 | -3.384471 | 18566 | 0.644492 | 1.973834e-06 | 1 | 0.019093 | 0.028409 |
955701 | CI.JRC2..HH | 8639499 | 2019-07-09 23:59:54.998 | 0.402 | S | 0.01 | -0.023 | -3.446238 | 18566 | 0.647843 | 6.197256e-06 | 1 | 0.020364 | -0.154404 |
955702 | CI.JRC2..HN | 8639499 | 2019-07-09 23:59:54.998 | 0.344 | S | 0.01 | -0.016 | -3.531800 | 18566 | 0.731796 | 4.075353e-06 | 1 | 0.012773 | -0.164659 |
955703 | CI.WVP2..EH | 8639574 | 2019-07-09 23:59:55.740 | 0.314 | S | 0.01 | -0.103 | -3.488117 | 18566 | 0.636005 | 2.815148e-06 | 1 | -0.092233 | 0.316712 |
955704 | CI.WVP2..HN | 8639576 | 2019-07-09 23:59:55.760 | 0.695 | S | 0.01 | 0.076 | -3.218676 | 18566 | 0.825223 | 4.657352e-06 | 1 | 0.039319 | 0.332722 |
955705 rows × 14 columns
Pick one event¶
Let's use the PhaseNet catalog to read the waveforms of an event.
def fetch_event_waveforms(
event_picks,
dir_waveforms=DIR_WAVEFORMS,
time_before_phase_onset_sec=2.0,
duration_sec=10.0
):
"""
Fetches the waveforms for a given event based on the picks.
Parameters
----------
event_picks : pandas.DataFrame
DataFrame containing the picks for the event.
dir_waveforms : str, optional
Directory where the waveform data is stored, by default DIR_WAVEFORMS.
time_before_phase_onset_sec : float, optional
Time in seconds to start the waveform before the phase onset, by default 2.0.
duration_sec : float, optional
Duration in seconds of the waveform to fetch, by default 10.0.
Returns
-------
obspy.Stream
Stream object containing the fetched waveforms.
"""
stream = obs.Stream()
for _, pick in event_picks.iterrows():
# check whether we have a miniseed file for this waveform
if pick.phase_type == "P":
files = glob.glob(os.path.join(dir_waveforms, pick.station_id + "Z*mseed"))
elif pick.phase_type == "S":
files = glob.glob(os.path.join(dir_waveforms, pick.station_id + "[N,E]*mseed"))
starttime = obs.UTCDateTime(pick.phase_time) - time_before_phase_onset_sec
endtime = starttime + duration_sec
for _file in files:
stream += obs.read(
_file,
starttime=starttime,
endtime=endtime
)
return stream
# explore event_meta to find a nice intermediate-size earthquake we could plot
event_meta.head(20)
time | adloc_score | adloc_residual_time | num_picks | magnitude | adloc_residual_amplitude | event_index | longitude | latitude | depth_km | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 2019-07-04 00:46:47.342963596 | 0.866596 | 0.057355 | 35 | 0.595368 | 0.165480 | 6720 | -117.882570 | 36.091088 | 4.643969 |
1 | 2019-07-04 00:55:32.648412579 | 0.781116 | 0.203567 | 16 | 1.142510 | 0.170509 | 17122 | -117.799226 | 35.378160 | 11.078458 |
2 | 2019-07-04 00:56:37.232733104 | 0.908073 | 0.086183 | 42 | 0.912494 | 0.166681 | 5411 | -117.880902 | 36.091986 | 4.854336 |
3 | 2019-07-04 02:00:39.149363202 | 0.814322 | 0.036164 | 15 | 0.209530 | 0.092534 | 17868 | -117.866468 | 36.093520 | 4.981447 |
4 | 2019-07-04 03:05:31.018885833 | 0.799281 | 0.080708 | 11 | 0.104050 | 0.156833 | 25037 | -117.846320 | 36.100386 | 5.943363 |
5 | 2019-07-04 03:20:28.674438914 | 0.755451 | 0.050436 | 44 | 0.869927 | 0.111885 | 6929 | -117.673684 | 36.114308 | 5.894466 |
6 | 2019-07-04 04:03:01.619369274 | 0.897311 | 0.143631 | 32 | 0.386310 | 0.214381 | 7607 | -117.805077 | 36.016063 | 0.396071 |
7 | 2019-07-04 05:16:47.223353119 | 0.835094 | 0.053522 | 22 | 0.575433 | 0.163675 | 13098 | -117.879256 | 36.090409 | 4.268292 |
8 | 2019-07-04 06:57:32.758991812 | 0.922386 | 0.065846 | 23 | 0.256281 | 0.065212 | 15698 | -117.867587 | 36.081665 | 5.939931 |
9 | 2019-07-04 11:51:07.805591259 | 0.692042 | 0.075023 | 48 | 0.818316 | 0.102087 | 4380 | -117.671909 | 36.118573 | 6.085804 |
10 | 2019-07-04 15:36:04.228420696 | 0.652888 | 0.012137 | 9 | -0.633904 | 0.167058 | 27370 | -117.785299 | 36.005578 | 3.234981 |
11 | 2019-07-04 15:42:47.932558745 | 0.597856 | 0.072740 | 28 | 0.933088 | 0.151623 | 9740 | -117.501116 | 35.707382 | 14.496446 |
12 | 2019-07-04 16:07:20.003321194 | 0.729246 | 0.093399 | 33 | 0.835854 | 0.133106 | 9365 | -117.491503 | 35.711241 | 13.846898 |
13 | 2019-07-04 16:11:46.920083440 | 0.711579 | 0.516367 | 18 | 1.734816 | 0.117952 | 7098 | -117.877675 | 35.196445 | 31.000000 |
14 | 2019-07-04 16:13:43.094792540 | 0.849673 | 0.070048 | 83 | 1.653859 | 0.111108 | 2305 | -117.493716 | 35.710090 | 13.375377 |
15 | 2019-07-04 16:16:07.085486307 | 0.814365 | 0.040161 | 10 | 0.587583 | 0.099619 | 26520 | -117.540198 | 35.689022 | 14.537368 |
16 | 2019-07-04 17:02:55.057058245 | 0.770696 | 0.079675 | 90 | 4.476724 | 0.143489 | 1101 | -117.495094 | 35.711607 | 13.586411 |
17 | 2019-07-04 17:04:02.231614981 | 0.664676 | 0.064038 | 41 | 2.062226 | 0.186214 | 5379 | -117.488446 | 35.711139 | 14.001705 |
18 | 2019-07-04 17:05:05.071421677 | 0.509088 | 0.089882 | 29 | 1.471434 | 0.157048 | 12115 | -117.491307 | 35.710881 | 13.223374 |
19 | 2019-07-04 17:08:51.664841725 | 0.666790 | 0.046444 | 15 | 0.657653 | 0.175270 | 23044 | -117.504506 | 35.706123 | 14.570921 |
# feel free to play with the event index to plot different events
EVENT_IDX = 1101
event_meta.set_index("event_index").loc[EVENT_IDX]
time 2019-07-04 17:02:55.057058245 adloc_score 0.770696 adloc_residual_time 0.079675 num_picks 90 magnitude 4.476724 adloc_residual_amplitude 0.143489 longitude -117.495094 latitude 35.711607 depth_km 13.586411 Name: 1101, dtype: object
# fetch the corresponding picks for this event
event_picks = picks[picks["event_index"] == EVENT_IDX]
event_picks
station_id | phase_index | phase_time | phase_score | phase_type | dt_s | phase_polarity | phase_amplitude | event_index | sp_ratio | event_amplitude | adloc_mask | adloc_residual_time | adloc_residual_amplitude | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
565 | CI.CLC..HN | 6137848 | 2019-07-04 17:02:58.488 | 0.969 | P | 0.01 | -0.879 | -0.511873 | 1101 | 1.017569 | 2.474507e-07 | 1 | -0.010048 | -0.054426 |
566 | CI.CLC..BH | 6137847 | 2019-07-04 17:02:58.489 | 0.633 | P | 0.01 | -0.324 | -0.666956 | 1101 | 1.014822 | 2.022085e-07 | 1 | 0.048153 | -0.127168 |
567 | CI.CLC..HH | 6137849 | 2019-07-04 17:02:58.498 | 0.965 | P | 0.01 | -0.883 | -0.586030 | 1101 | 1.086886 | 2.141392e-07 | 1 | -0.007135 | -0.236796 |
568 | CI.SRT..HN | 6137990 | 2019-07-04 17:02:59.908 | 0.879 | P | 0.01 | 0.699 | -0.643401 | 1101 | 1.377537 | 4.387864e-07 | 1 | -0.069463 | 0.065039 |
569 | CI.SRT..HH | 6137990 | 2019-07-04 17:02:59.908 | 0.891 | P | 0.01 | 0.734 | -0.601019 | 1101 | 0.729745 | 5.910421e-07 | 1 | -0.078144 | 0.020251 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
650 | CI.WMF..HH | 6139185 | 2019-07-04 17:03:11.858 | 0.672 | S | 0.01 | 0.021 | -1.368759 | 1101 | 1.096820 | 3.073536e-07 | 1 | 0.140179 | -0.329117 |
651 | CI.WMF..HN | 6139189 | 2019-07-04 17:03:11.898 | 0.633 | S | 0.01 | 0.026 | -1.371407 | 1101 | 0.989939 | 2.891483e-07 | 1 | 0.159858 | -0.171350 |
652 | CI.DTP..BH | 6139200 | 2019-07-04 17:03:12.019 | 0.734 | S | 0.01 | 0.035 | -1.322484 | 1101 | 1.385395 | 1.624267e-07 | 1 | 0.068940 | 0.175198 |
653 | CI.DTP..HH | 6139203 | 2019-07-04 17:03:12.038 | 0.707 | S | 0.01 | -0.015 | -1.306977 | 1101 | 1.122809 | 3.205840e-07 | 1 | 0.052824 | 0.125587 |
654 | CI.DTP..HN | 6139211 | 2019-07-04 17:03:12.118 | 0.625 | S | 0.01 | 0.043 | -1.282829 | 1101 | 0.863380 | 2.702479e-07 | 1 | 0.105026 | 0.211253 |
90 rows × 14 columns
# fetch the waveforms
event_waveforms = fetch_event_waveforms(event_picks, time_before_phase_onset_sec=10., duration_sec=30.)
print(event_waveforms.__str__(extended=True))
42 Trace(s) in Stream: CI.CLC..HHZ | 2019-07-04T17:02:48.478300Z - 2019-07-04T17:03:18.478300Z | 25.0 Hz, 751 samples CI.SRT..HHZ | 2019-07-04T17:02:49.918300Z - 2019-07-04T17:03:19.918300Z | 25.0 Hz, 751 samples CI.CCC..HHZ | 2019-07-04T17:02:50.118300Z - 2019-07-04T17:03:20.118300Z | 25.0 Hz, 751 samples CI.SLA..HHZ | 2019-07-04T17:02:50.518300Z - 2019-07-04T17:03:20.518300Z | 25.0 Hz, 751 samples CI.TOW2..HHZ | 2019-07-04T17:02:50.518300Z - 2019-07-04T17:03:20.518300Z | 25.0 Hz, 751 samples CI.LRL..HHZ | 2019-07-04T17:02:50.638300Z - 2019-07-04T17:03:20.638300Z | 25.0 Hz, 751 samples CI.WRC2..HHZ | 2019-07-04T17:02:50.718300Z - 2019-07-04T17:03:20.718300Z | 25.0 Hz, 751 samples CI.CLC..HHN | 2019-07-04T17:02:50.958300Z - 2019-07-04T17:03:20.958300Z | 25.0 Hz, 751 samples CI.CLC..HHE | 2019-07-04T17:02:50.958300Z - 2019-07-04T17:03:20.958300Z | 25.0 Hz, 751 samples CI.MPM..HHZ | 2019-07-04T17:02:52.038300Z - 2019-07-04T17:03:22.038300Z | 25.0 Hz, 751 samples CI.WBM..HHZ | 2019-07-04T17:02:52.123100Z - 2019-07-04T17:03:22.123100Z | 25.0 Hz, 751 samples CI.WVP2..EHZ | 2019-07-04T17:02:52.160000Z - 2019-07-04T17:03:22.160000Z | 25.0 Hz, 751 samples CI.WNM..EHZ | 2019-07-04T17:02:52.160000Z - 2019-07-04T17:03:22.160000Z | 25.0 Hz, 751 samples CI.JRC2..HHZ | 2019-07-04T17:02:52.558300Z - 2019-07-04T17:03:22.558300Z | 25.0 Hz, 751 samples CI.WCS2..HHZ | 2019-07-04T17:02:52.598300Z - 2019-07-04T17:03:22.598300Z | 25.0 Hz, 751 samples CI.WRV2..EHZ | 2019-07-04T17:02:53.600000Z - 2019-07-04T17:03:23.600000Z | 25.0 Hz, 751 samples CI.SRT..HHN | 2019-07-04T17:02:53.838300Z - 2019-07-04T17:03:23.838300Z | 25.0 Hz, 751 samples CI.SRT..HHE | 2019-07-04T17:02:53.838300Z - 2019-07-04T17:03:23.838300Z | 25.0 Hz, 751 samples CI.CCC..HHN | 2019-07-04T17:02:54.078300Z - 2019-07-04T17:03:24.078300Z | 25.0 Hz, 751 samples CI.CCC..HHE | 2019-07-04T17:02:54.078300Z - 2019-07-04T17:03:24.078300Z | 25.0 Hz, 751 samples CI.SLA..HHE | 2019-07-04T17:02:54.638300Z - 2019-07-04T17:03:24.638300Z | 25.0 Hz, 751 samples CI.SLA..HHN | 2019-07-04T17:02:54.638300Z - 2019-07-04T17:03:24.638300Z | 25.0 Hz, 751 samples CI.LRL..HHN | 2019-07-04T17:02:54.718300Z - 2019-07-04T17:03:24.718300Z | 25.0 Hz, 751 samples CI.LRL..HHE | 2019-07-04T17:02:54.718300Z - 2019-07-04T17:03:24.718300Z | 25.0 Hz, 751 samples CI.WMF..HHZ | 2019-07-04T17:02:54.798300Z - 2019-07-04T17:03:24.798300Z | 25.0 Hz, 751 samples CI.DTP..HHZ | 2019-07-04T17:02:54.958300Z - 2019-07-04T17:03:24.958300Z | 25.0 Hz, 751 samples CI.TOW2..HHN | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples CI.TOW2..HHE | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples CI.WRC2..HHN | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples CI.WRC2..HHE | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples CI.WBM..HHE | 2019-07-04T17:02:57.243100Z - 2019-07-04T17:03:27.243100Z | 25.0 Hz, 751 samples CI.WBM..HHN | 2019-07-04T17:02:57.243100Z - 2019-07-04T17:03:27.243100Z | 25.0 Hz, 751 samples CI.MPM..HHE | 2019-07-04T17:02:57.318300Z - 2019-07-04T17:03:27.318300Z | 25.0 Hz, 751 samples CI.MPM..HHN | 2019-07-04T17:02:57.318300Z - 2019-07-04T17:03:27.318300Z | 25.0 Hz, 751 samples CI.JRC2..HHN | 2019-07-04T17:02:57.958300Z - 2019-07-04T17:03:27.958300Z | 25.0 Hz, 751 samples CI.JRC2..HHE | 2019-07-04T17:02:57.958300Z - 2019-07-04T17:03:27.958300Z | 25.0 Hz, 751 samples CI.WCS2..HHE | 2019-07-04T17:02:58.118300Z - 2019-07-04T17:03:28.118300Z | 25.0 Hz, 751 samples CI.WCS2..HHN | 2019-07-04T17:02:58.118300Z - 2019-07-04T17:03:28.118300Z | 25.0 Hz, 751 samples CI.WMF..HHN | 2019-07-04T17:03:01.838300Z - 2019-07-04T17:03:31.838300Z | 25.0 Hz, 751 samples CI.WMF..HHE | 2019-07-04T17:03:01.838300Z - 2019-07-04T17:03:31.838300Z | 25.0 Hz, 751 samples CI.DTP..HHN | 2019-07-04T17:03:02.038300Z - 2019-07-04T17:03:32.038300Z | 25.0 Hz, 751 samples CI.DTP..HHE | 2019-07-04T17:03:02.038300Z - 2019-07-04T17:03:32.038300Z | 25.0 Hz, 751 samples
# plot them!
fig = event_waveforms.select(component="Z").plot(equal_scale=False)
Run template matching¶
We will now use one of the events from the PhaseNet catalog as a template event to detect events with template matching.
selected_event_meta = event_meta.set_index("event_index").loc[EVENT_IDX]
selected_event_meta
time 2019-07-04 17:02:55.057058245 adloc_score 0.770696 adloc_residual_time 0.079675 num_picks 90 magnitude 4.476724 adloc_residual_amplitude 0.143489 longitude -117.495094 latitude 35.711607 depth_km 13.586411 Name: 1101, dtype: object
selected_event_picks = picks[picks["event_index"] == EVENT_IDX]
selected_event_picks
station_id | phase_index | phase_time | phase_score | phase_type | dt_s | phase_polarity | phase_amplitude | event_index | sp_ratio | event_amplitude | adloc_mask | adloc_residual_time | adloc_residual_amplitude | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
565 | CI.CLC..HN | 6137848 | 2019-07-04 17:02:58.488 | 0.969 | P | 0.01 | -0.879 | -0.511873 | 1101 | 1.017569 | 2.474507e-07 | 1 | -0.010048 | -0.054426 |
566 | CI.CLC..BH | 6137847 | 2019-07-04 17:02:58.489 | 0.633 | P | 0.01 | -0.324 | -0.666956 | 1101 | 1.014822 | 2.022085e-07 | 1 | 0.048153 | -0.127168 |
567 | CI.CLC..HH | 6137849 | 2019-07-04 17:02:58.498 | 0.965 | P | 0.01 | -0.883 | -0.586030 | 1101 | 1.086886 | 2.141392e-07 | 1 | -0.007135 | -0.236796 |
568 | CI.SRT..HN | 6137990 | 2019-07-04 17:02:59.908 | 0.879 | P | 0.01 | 0.699 | -0.643401 | 1101 | 1.377537 | 4.387864e-07 | 1 | -0.069463 | 0.065039 |
569 | CI.SRT..HH | 6137990 | 2019-07-04 17:02:59.908 | 0.891 | P | 0.01 | 0.734 | -0.601019 | 1101 | 0.729745 | 5.910421e-07 | 1 | -0.078144 | 0.020251 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
650 | CI.WMF..HH | 6139185 | 2019-07-04 17:03:11.858 | 0.672 | S | 0.01 | 0.021 | -1.368759 | 1101 | 1.096820 | 3.073536e-07 | 1 | 0.140179 | -0.329117 |
651 | CI.WMF..HN | 6139189 | 2019-07-04 17:03:11.898 | 0.633 | S | 0.01 | 0.026 | -1.371407 | 1101 | 0.989939 | 2.891483e-07 | 1 | 0.159858 | -0.171350 |
652 | CI.DTP..BH | 6139200 | 2019-07-04 17:03:12.019 | 0.734 | S | 0.01 | 0.035 | -1.322484 | 1101 | 1.385395 | 1.624267e-07 | 1 | 0.068940 | 0.175198 |
653 | CI.DTP..HH | 6139203 | 2019-07-04 17:03:12.038 | 0.707 | S | 0.01 | -0.015 | -1.306977 | 1101 | 1.122809 | 3.205840e-07 | 1 | 0.052824 | 0.125587 |
654 | CI.DTP..HN | 6139211 | 2019-07-04 17:03:12.118 | 0.625 | S | 0.01 | 0.043 | -1.282829 | 1101 | 0.863380 | 2.702479e-07 | 1 | 0.105026 | 0.211253 |
90 rows × 14 columns
Read data from same day¶
def fetch_day_waveforms(dir_waveforms):
"""
Fetches the continuous seismograms for a given day.
Parameters
----------
dir_waveforms : str
Directory where the waveform data is stored, by default DIR_WAVEFORMS.
Returns
-------
obspy.Stream
Stream object containing the fetched continuous seismograms.
"""
stream = obs.Stream()
files = glob.glob(os.path.join(dir_waveforms, "*mseed"))
for _file in files:
stream += obs.read(_file)
return stream
# first, read the continuous seismograms into an `obspy.Stream`
continuous_seismograms = fetch_day_waveforms(DIR_WAVEFORMS)
print(continuous_seismograms.__str__(extended=True))
57 Trace(s) in Stream: CI.WCS2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B916..EHZ | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples CI.CLC..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WRV2..EHZ | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples PB.B916..EH2 | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples PB.B917..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WMF..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B918..EHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B918..EH2 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.TOW2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SLA..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WRC2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.JRC2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.CCC..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DTP..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SRT..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.LRL..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SLA..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.MPM..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DTP..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WBM..HHZ | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples CI.WCS2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.CLC..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B921..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DAW..HHZ | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples CI.CLC..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WMF..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.TOW2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DAW..HHE | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples PB.B918..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WRC2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.JRC2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.CCC..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B916..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.MPM..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DTP..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WBM..HHE | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples CI.SRT..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B917..EHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WVP2..EHZ | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples PB.B917..EH2 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.LRL..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.CCC..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SLA..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WNM..EHZ | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples CI.WRC2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.JRC2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SRT..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.LRL..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.MPM..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B921..EHZ | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples CI.WBM..HHN | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples PB.B921..EH2 | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples CI.WMF..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WCS2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.TOW2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DAW..HHN | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples
Introduce gaps in some stations¶
GAP_START_SEC = 3. * 60. * 60.
GAP_END_SEC = 14. * 60. * 60.
station_list = list(set([st.stats.station for st in continuous_seismograms]))
STATIONS_W_GAP = np.random.choice(
station_list, size=int(3. / 4. * len(station_list)), replace=False
)
for sta in STATIONS_W_GAP:
for tr in continuous_seismograms.select(station=sta):
gap_start_samp = int(GAP_START_SEC * tr.stats.sampling_rate)
gap_end_samp = int(GAP_END_SEC * tr.stats.sampling_rate)
tr.data[gap_start_samp:gap_end_samp] = 0.
# plot the continuous seismograms from a single station
fig = continuous_seismograms.select(station="CLC").plot()
# then, cast data into `numpy.ndarray`
station_codes = list(set([st.stats.station for st in continuous_seismograms]))
component_codes = ["N", "E", "Z"]
component_aliases={"E": ["E", "2"], "N": ["N", "1"], "Z": ["Z"]}
num_stations = len(station_codes)
num_channels = len(component_codes)
num_samples = len(continuous_seismograms[0].data)
continuous_seismograms_arr = np.zeros((num_stations, num_channels, num_samples), dtype=np.float32)
for s, sta in enumerate(station_codes):
for c, cp in enumerate(component_codes):
for cp_alias in component_aliases[cp]:
sel_seismogram = continuous_seismograms.select(station=sta, component=cp_alias)
if len(sel_seismogram) > 0:
continuous_seismograms_arr[s, c, :] = sel_seismogram[0].data
break
continuous_seismograms_arr
array([[[-9.8226795e-11, 5.7554624e-11, 3.4072033e-11, ..., -1.8828223e-09, 8.1136459e-10, -1.7122942e-10], [-1.7923078e-10, 9.7317022e-11, 3.3442499e-10, ..., 7.6900075e-10, 2.7582496e-09, -1.1529546e-09], [-1.6065722e-10, -2.5988860e-11, -8.8432366e-11, ..., 6.2743988e-10, 3.0330730e-10, -1.6875973e-10]], [[ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [ 1.8492177e-11, 7.3153331e-11, 1.3191939e-10, ..., 4.4108817e-10, -3.0806005e-10, -4.0270010e-10]], [[-1.6893502e-11, 8.4201196e-12, 2.7799072e-11, ..., -1.7474208e-10, -4.5014548e-11, -2.6048433e-10], [ 1.4196522e-11, 3.8935088e-12, -2.4043532e-12, ..., -4.9603099e-11, -9.2666923e-11, 2.2229388e-10], [-7.8133723e-12, 1.1610854e-11, 1.2903780e-11, ..., -8.2966682e-11, -9.3903114e-11, -8.1583212e-12]], ..., [[ 5.4725401e-11, 7.0578925e-11, -6.2566233e-12, ..., -3.6589140e-10, -7.8155921e-10, -6.3879835e-10], [-1.3925512e-11, 2.0290504e-11, 1.0709553e-11, ..., 1.0249724e-09, 7.7783147e-10, -4.5876902e-10], [ 1.9426023e-10, -3.3486833e-10, -1.7510682e-10, ..., 9.0171864e-10, -9.0758560e-11, -1.7541314e-10]], [[-6.0105490e-11, -1.9033368e-09, -2.2194966e-09, ..., 9.9305908e-10, -2.5555899e-10, -1.2649487e-09], [ 3.3998053e-09, -9.0392399e-10, -4.9158380e-09, ..., -5.5282345e-10, -3.0974114e-09, -1.6373071e-09], [-6.3921446e-10, -1.2163219e-09, -1.2187444e-09, ..., -1.0883094e-09, 2.2945404e-10, 4.6940152e-10]], [[ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [-2.3801865e-11, 4.4190979e-12, -4.3637774e-11, ..., -4.7749610e-11, -1.3966529e-10, -1.5340888e-10]]], dtype=float32)
Build template¶
------------------ Background ------------------¶
A template is a collection of waveforms at different channels, $T_{s,c}(t)$, which are clips taken from the continuous seismograms, $u_{s,c}$. These clips are taken at times defined by: $$ u_{s,c}(t)\ |\ t \in \lbrace \tau_{s,c}; \tau_{s,c} + D \rbrace, $$ where $\tau_{s,c}$ is the start time of the template window and $D$ is the template duration.
$\tau_{s,c}$ is given by some prior information on the event: picks or modeled arrival times. The moveouts, $\tilde{\tau}_{s,c}$, are the collection of delay times relative to the earliest $\tau_{s,c}$: $$ \tilde{\tau}_{s,c} = \tau_{s,c} - \underset{s,c}{\min} \lbrace \tau_{s,c} \rbrace .$$
------------------ On the necessity to clip template waveforms out of numpy.ndarray
instead of obspy.Stream
------------------¶
Looking carefully at the output of print(continuous_seismograms.__str__(extended=True))
, a few cells before, we see that start times are generally not exactly at midnight. This is a consequence of the discrete nature of the continuous seismograms (here, sampled at 25 samples per second). Thus, in general, the $\tau_{s,c}$ computed from picks or modeled arrival times fall in between two samples of the seismograms.
When running a matched-filter search, we need to make sure the moveouts, $\tilde{\tau}_{s,c}$, ultimately expressed in samples, match exactly the times that were used when clipping the template waveforms out of $u_{s,c}$. One way to ensure this is to first cast the $\tau_{s,c}$ to times in samples and then operate exclusively on the numpy.ndarray
: continuous_seismograms_arr
:
$$ T_{s,c}[t_n] = u_{s,c}[\tau_{s,c} + n \Delta t],$$ where $\Delta t$ is the sampling time.
Clip out waveforms and moveout and station-weight arrays¶
# PHASE_ON_COMP: dictionary defining which moveout we use to extract the waveform.
# Here, we use windows centered around the S wave for horizontal components
# and windows starting 1sec before the P wave for the vertical component.
PHASE_ON_COMP = {"N": "S", "E": "S", "Z": "P"}
# OFFSET_PHASE_SEC: dictionary defining the time offset taken before a given phase
# for example OFFSET_PHASE_SEC["P"] = 1.0 means that we extract the window
# 1 second before the predicted P arrival time
OFFSET_PHASE_SEC = {"P": 1.0, "S": 4.0}
# TEMPLATE_DURATION_SEC
TEMPLATE_DURATION_SEC = 8.
# SAMPLING_RATE_HZ
SAMPLING_RATE_HZ = 25.
# TEMPLATE_DURATION_SAMP
TEMPLATE_DURATION_SAMP = int(TEMPLATE_DURATION_SEC * SAMPLING_RATE_HZ)
# add station_code columns to `selected_event_picks`
selected_event_picks.set_index("station_id", inplace=True)
for staid in selected_event_picks.index:
station_code = staid.split(".")[1]
selected_event_picks.loc[staid, "station_code"] = station_code
selected_event_picks
/tmp/ipykernel_2902838/1881751651.py:5: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy selected_event_picks.loc[staid, "station_code"] = station_code
phase_index | phase_time | phase_score | phase_type | dt_s | phase_polarity | phase_amplitude | event_index | sp_ratio | event_amplitude | adloc_mask | adloc_residual_time | adloc_residual_amplitude | station_code | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
station_id | ||||||||||||||
CI.CLC..HN | 6137848 | 2019-07-04 17:02:58.488 | 0.969 | P | 0.01 | -0.879 | -0.511873 | 1101 | 1.017569 | 2.474507e-07 | 1 | -0.010048 | -0.054426 | CLC |
CI.CLC..BH | 6137847 | 2019-07-04 17:02:58.489 | 0.633 | P | 0.01 | -0.324 | -0.666956 | 1101 | 1.014822 | 2.022085e-07 | 1 | 0.048153 | -0.127168 | CLC |
CI.CLC..HH | 6137849 | 2019-07-04 17:02:58.498 | 0.965 | P | 0.01 | -0.883 | -0.586030 | 1101 | 1.086886 | 2.141392e-07 | 1 | -0.007135 | -0.236796 | CLC |
CI.SRT..HN | 6137990 | 2019-07-04 17:02:59.908 | 0.879 | P | 0.01 | 0.699 | -0.643401 | 1101 | 1.377537 | 4.387864e-07 | 1 | -0.069463 | 0.065039 | SRT |
CI.SRT..HH | 6137990 | 2019-07-04 17:02:59.908 | 0.891 | P | 0.01 | 0.734 | -0.601019 | 1101 | 0.729745 | 5.910421e-07 | 1 | -0.078144 | 0.020251 | SRT |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
CI.WMF..HH | 6139185 | 2019-07-04 17:03:11.858 | 0.672 | S | 0.01 | 0.021 | -1.368759 | 1101 | 1.096820 | 3.073536e-07 | 1 | 0.140179 | -0.329117 | WMF |
CI.WMF..HN | 6139189 | 2019-07-04 17:03:11.898 | 0.633 | S | 0.01 | 0.026 | -1.371407 | 1101 | 0.989939 | 2.891483e-07 | 1 | 0.159858 | -0.171350 | WMF |
CI.DTP..BH | 6139200 | 2019-07-04 17:03:12.019 | 0.734 | S | 0.01 | 0.035 | -1.322484 | 1101 | 1.385395 | 1.624267e-07 | 1 | 0.068940 | 0.175198 | DTP |
CI.DTP..HH | 6139203 | 2019-07-04 17:03:12.038 | 0.707 | S | 0.01 | -0.015 | -1.306977 | 1101 | 1.122809 | 3.205840e-07 | 1 | 0.052824 | 0.125587 | DTP |
CI.DTP..HN | 6139211 | 2019-07-04 17:03:12.118 | 0.625 | S | 0.01 | 0.043 | -1.282829 | 1101 | 0.863380 | 2.702479e-07 | 1 | 0.105026 | 0.211253 | DTP |
90 rows × 14 columns
In the following cell, we build the numpy.ndarray
of moveouts $\tilde{\tau}_{s,c}$, expressed in units of samples.
# first, we extract the set of relative delay times of the beginning of each
# template window on a given station and component
tau_s_c_sec = np.zeros((num_stations, num_channels), dtype=np.float64)
for s, sta in enumerate(station_codes):
for c, cp in enumerate(component_codes):
phase_type = PHASE_ON_COMP[cp]
picks_s_c = selected_event_picks[
(
(selected_event_picks["station_code"] == sta)
& (selected_event_picks["phase_type"] == phase_type)
)
]
if len(picks_s_c) == 0:
# no pick for this station/component: set to -999
tau_s_c_sec[s, c] = -999
elif len(picks_s_c) == 1:
# express pick relative to beginning of day (midnight)
_pick = pd.Timestamp(picks_s_c["phase_time"])
_relative_pick_sec = (_pick - pd.Timestamp(_pick.strftime("%Y-%m-%d"))).total_seconds()
tau_s_c_sec[s, c] = _relative_pick_sec - OFFSET_PHASE_SEC[phase_type]
else:
# there were several picks from different channels: average them
_relative_pick_sec = 0.
for _pick in picks_s_c["phase_time"].values:
_pick = pd.Timestamp(_pick)
_relative_pick_sec += (_pick - pd.Timestamp(_pick.strftime("%Y-%m-%d"))).total_seconds()
_relative_pick_sec /= float(len(picks_s_c["phase_time"]))
tau_s_c_sec[s, c] = _relative_pick_sec - OFFSET_PHASE_SEC[phase_type]
# now, we convert these relative times into samples
# and express them relative to the earliest time
# we also store in memory the minimum time offset `tau_min_samp` for the next step
moveouts_samp_arr = (tau_s_c_sec * SAMPLING_RATE_HZ).astype(np.int64)
tau_min_samp = np.min(moveouts_samp_arr[moveouts_samp_arr > 0])
moveouts_samp_arr = moveouts_samp_arr - tau_min_samp
moveouts_samp_arr
array([[ 94, 94, 66], [ 160, 160, 105], [-1559399, -1559399, -1559399], [ 159, 159, 102], [ 272, 272, 171], [ 179, 179, 116], [ 101, 101, 68], [-1559399, -1559399, -1559399], [ 78, 78, 53], [ 277, 277, 174], [ 225, 225, 140], [ 0, 0, 13], [-1559399, -1559399, -1559399], [ 102, 102, 64], [-1559399, -1559399, -1559399], [ 72, 72, 48], [-1559399, -1559399, -1559399], [ 175, 175, 115], [ 156, 156, 104], [ 91, 91, 62], [ 165, 165, 105]])
Next, we use the moveouts, in samples, to clip out the relevant template waveforms from the continuous seismograms.
template_waveforms_arr = np.zeros((num_stations, num_channels, TEMPLATE_DURATION_SAMP), dtype=np.float32)
weights_arr = np.ones((num_stations, num_channels), dtype=np.float32)
for s, sta in enumerate(station_codes):
for c, cp in enumerate(component_codes):
if moveouts_samp_arr[s, c] < 0:
# no picks were found on this station
weights_arr[s, c] = 0.
continue
starttime = tau_min_samp + moveouts_samp_arr[s, c]
endtime = starttime + TEMPLATE_DURATION_SAMP
template_waveforms_arr[s, c, :] = continuous_seismograms_arr[s, c, starttime:endtime]
if template_waveforms_arr[s, c, :].sum() == 0.:
# no data was available on this channel
weights_arr[s, c] = 0.
template_waveforms_arr
array([[[-7.95800133e-06, 8.55314720e-05, 2.11978477e-04, ..., -3.52683623e-04, 9.79480028e-05, 1.74769040e-04], [-1.22021929e-05, 4.51964515e-05, 1.12187183e-04, ..., -4.29799955e-04, -1.41114127e-04, 4.94585664e-04], [ 2.51332494e-07, 1.18968394e-07, -9.30824982e-08, ..., -8.89477014e-05, 8.27156691e-05, 1.22664089e-04]], [[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 1.27826226e-07, 1.54385773e-07, 1.52002073e-07, ..., 1.35540016e-04, -3.16935242e-04, -1.91862506e-04]], [[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00]], ..., [[-1.15102615e-04, -2.08503159e-04, -8.57295672e-05, ..., -3.02342505e-05, 1.90505059e-04, 2.32697013e-04], [-6.95939161e-06, -1.14591476e-05, -4.32817069e-05, ..., 1.95618981e-04, -6.08503397e-05, -1.64061450e-04], [-3.15401110e-08, 6.19310697e-07, 1.24642884e-06, ..., -6.75416959e-05, 9.95227219e-06, 9.39235106e-05]], [[ 1.31645243e-06, 1.41594983e-05, 1.18550970e-05, ..., -1.61218868e-05, 1.20281387e-04, 7.25206701e-05], [ 1.01139749e-05, 3.06667994e-06, 6.11646101e-07, ..., 5.91952994e-04, 5.78140549e-04, 2.96404498e-04], [ 5.31832107e-08, 8.85903688e-08, 1.39000178e-07, ..., 3.86358661e-05, 4.55262576e-04, 2.83642818e-04]], [[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 6.11326456e-08, 1.84645728e-07, 2.68150814e-07, ..., -1.40672455e-05, 9.12318046e-06, 1.10272435e-06]]], dtype=float32)
# normalize template waveforms for numerical reasons
norm = np.std(template_waveforms_arr, axis=-1, keepdims=True)
norm[norm == 0.] = 1.
template_waveforms_arr /= norm
# normalize weights so that they sum up to one
weights_arr /= np.sum(weights_arr)
# normalize continuous seismograms for numerical reasons
norm = np.std(continuous_seismograms_arr, axis=-1, keepdims=True)
norm[norm == 0.] = 1.
continuous_seismograms_arr /= norm
Run FMF¶
After all this data formatting, we can now run template matching (also called matched-filtering) to detect new events that are similar to our template event.
For that, use the software Fast Matched Filter (FMF): https://github.com/beridel/fast_matched_filter
FMF offers C and CUDA-C routines to efficiently run template matching on CPUs, or even on GPUs if available to you.
# FMF_STEP_SAMP: this is the step between two consecutive calculation of the correlation coefficient
FMF_STEP_SAMP = 1
# ARCH: it determines whether you want to use GPUs or CPUs
# If you do not have an Nvidia GPU, set ARCH = "cpu"
ARCH = "gpu"
The following cell computes the time series of correlation coefficients between the template waveforms, $T_{s,c}$, and the continuous seismograms, $u_{s,c}$: $$ CC(t) = \sum_{s,c} w_{s,c} \sum_{i=1}^N \dfrac{T^*_{s,c}(n \Delta t) u^*_{s,c}(t + \tilde{\tau}_{s,c} + n \Delta t)}{\sqrt{\sum_{i=1}^N {T^*_{s,c}}^2(n \Delta t) \sum_{i=1}^N {u^*_{s,c}}^2(t + \tilde{\tau}_{s,c} + n \Delta t)}}, $$ with:
- $T^*_{s,c} = T_{s,c} - \frac{1}{N} \sum_{i=1}^N T_{s,c}(n \Delta t)$,
- $u^*_{s,c}(t) = u_{s,c}(t) - \frac{1}{N} \sum_{i=1}^N u_{s,c}(t + n \Delta t)$.
Note that because the seismograms were filtered below periods that are shorter than the template window ($N \Delta t$) we have $T^*_{s,c} \approx T_{s,c}$ and $u^*_{s,c} \approx u_{s,c}$, which spares us the computation of the mean in each sliding window.
cc = fmf.matched_filter(
template_waveforms_arr.astype(np.float32),
moveouts_samp_arr.astype(np.int32),
weights_arr.astype(np.float32),
continuous_seismograms_arr.astype(np.float32),
FMF_STEP_SAMP,
arch=ARCH,
)
# FMF is programmed to handle multiple templates at once. Here, we only used
# a single template, hence the size of the outermost axis of "1"
cc.shape
(1, 2159801)
# let's print the output of our template matching run, which a time series of network-averaged correlation coefficients
# of same duration as the continuous seismograms
_cc = cc[0, :]
time_cc = np.arange(len(_cc)) / SAMPLING_RATE_HZ
fig = plt.figure("network_averaged_cc", figsize=(20, 6))
gs = fig.add_gridspec(ncols=4)
ax1 = fig.add_subplot(gs[:3])
ax1.plot(time_cc, _cc, lw=0.75)
ax1.set_xlabel("Elapsed time (sec)")
ax1.set_ylabel("Network-averaged cc")
ax1.set_xlim(time_cc.min(), time_cc.max())
ax1.set_title("Time series of template/continuous seismograms similarity")
ax2 = fig.add_subplot(gs[3], sharey=ax1)
_ = ax2.hist(_cc, orientation="horizontal", bins=250, density=True)
ax2.set_xlabel("Normalized count")
ax2.set_title("Histogram")
for ax in [ax1, ax2]:
ax.grid()
Set detection threshold and find events¶
We will use the time series of correlation coefficients to build an earthquake catalog. For that, we need to set a detection threshold and define all times above that threshold as triggers caused by near-repeats of the template event.
def select_cc_indexes(
cc_t,
threshold,
search_win,
):
"""Select the peaks in the CC time series.
Parameters
------------
cc_t: (n_corr,) numpy.ndarray
The CC time series for one template.
threshold: (n_corr,) numpy.ndarray or scalar
The detection threshold.
search_win: scalar int
The minimum inter-event time, in units of correlation step.
Returns
--------
cc_idx: (n_detections,) numpy.ndarray
The list of all selected CC indexes. They give the timings of the
detected events.
"""
cc_detections = cc_t > threshold
cc_idx = np.where(cc_detections)[0]
cc_idx = list(cc_idx)
n_rm = 0
for i in range(1, len(cc_idx)):
if (cc_idx[i - n_rm] - cc_idx[i - n_rm - 1]) < search_win:
if cc_t[cc_idx[i - n_rm]] > cc_t[cc_idx[i - n_rm - 1]]:
# keep (i-n_rm)-th detection
cc_idx.remove(cc_idx[i - n_rm - 1])
else:
# keep (i-n_rm-1)-th detection
cc_idx.remove(cc_idx[i - n_rm])
n_rm += 1
cc_idx = np.asarray(cc_idx)
return cc_idx
# INTEREVENT_TIME_RESOLUTION_SEC: In some cases, a template might trigger multiple, closely spaced detections because
# of a phenomenon similar to that of "cycle skipping", where the waveform correlates
# well with a time-shifted version of itself. Thus, to avoid redundant detections, we
# set a minimum time separation between triggers (rule of thumb: about half the template duration)
INTEREVENT_TIME_RESOLUTION_SEC = 5.
INTEREVENT_TIME_RESOLUTION_SAMP = int(INTEREVENT_TIME_RESOLUTION_SEC * SAMPLING_RATE_HZ)
_cc = cc[0, :]
time_cc = np.arange(len(_cc)) * FMF_STEP_SAMP / SAMPLING_RATE_HZ
NUM_RMS = 8.
detection_threshold = NUM_RMS * np.std(_cc)
event_cc_indexes = select_cc_indexes(_cc, detection_threshold, INTEREVENT_TIME_RESOLUTION_SAMP)
fig = plt.figure("network_averaged_cc", figsize=(20, 6))
gs = fig.add_gridspec(ncols=4)
ax1 = fig.add_subplot(gs[:3])
ax1.plot(time_cc, _cc, lw=0.75)
ax1.scatter(time_cc[event_cc_indexes], _cc[event_cc_indexes], linewidths=0.25, edgecolor="k", color="r", zorder=2)
ax1.set_xlabel("Elapsed time (sec)")
ax1.set_ylabel("Network-averaged cc")
ax1.set_xlim(time_cc.min(), time_cc.max())
ax1.set_title("Time series of template/continuous seismograms similarity")
ax2 = fig.add_subplot(gs[3], sharey=ax1)
_ = ax2.hist(_cc, orientation="horizontal", bins=250, density=True, zorder=2)
ax2.set_xlabel("Normalized count")
ax2.set_title("Histogram")
label = f"Detection threshold: {NUM_RMS:.0f}"r"$\times \mathrm{RMS}(\mathrm{CC}(t))$"f"\n{len(event_cc_indexes):d} detected events"
for ax in [ax1, ax2]:
ax.grid()
ax.axhline(
detection_threshold, ls="--", color="r",
label=label
)
ax1.legend(loc="upper left")
<matplotlib.legend.Legend at 0x7fae4e49bb20>
The time variation of the standard deviations of $CC(t)$ caused by gaps in some of the stations may lower the detection threshold and may thus trigger many false detections. In general, it's better to use a time-dependent detection threshold to adapt to possible gaps in the data. When using template matching on smaller seismic networks than in this example, a gap in a single station may strongly affect your time series $CC(t)$!
Interested in a bullet-proof time-dependent threshold? Check out the link below: https://ebeauce.github.io/Seismic_BPMF/usage/api/similarity_search.html#BPMF.similarity_search.time_dependent_threshold