Template matching with a single template¶
Templates are selected from Weiqiang Zhu's PhaseNet catalog.
Data requirements¶
Download the seismic data at: https://doi.org/10.5281/zenodo.15097180
Download the PhaseNet earthquake catalog with the following three commands:
- curl -o adloc_events.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_events.csv
- curl -o adloc_picks.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_picks.csv
- curl -o adloc_stations.csv https://storage.googleapis.com/quakeflow_share/Ridgecrest/adloc/adloc_stations.csv
Installing fast_matched_filter
¶
This example uses fast_matched_filter
, a Python wrapper for C and CUDA-C routines. The C and CUDA-C libraries have to be compiled. Read the instructions at https://ebeauce.github.io/FMF_documentation/introduction.html.
import os
import fast_matched_filter as fmf
import glob
import numpy as np
import matplotlib.pyplot as plt
import obspy as obs
import pandas as pd
# path variables and file names
DIR_WAVEFORMS = "/home/ebeauce/SSA_EQ_DETECTION_WORKSHOP/data" # REPLACE THIS PATH WITH WHEREVER YOU DOWNLOADED THE DATA
DIR_CATALOG = "../picks_phasenet/"
STATION_FILE = "adloc_stations.csv"
EVENT_FILE = "adloc_events.csv"
PICK_FILE = "adloc_picks.csv"
Load PhaseNet catalog¶
Here, we read the catalog of the 2019 Ridgecrest sequence made with PhaseNet. Information is divided into three files:
- a station metadata file,
- an event metadata file (the catalog per se),
- a pick database, which contains all the P- and S-wave picks found by PhaseNet.
station_meta = pd.read_csv(os.path.join(DIR_CATALOG, STATION_FILE))
station_meta
network | station | location | instrument | component | latitude | longitude | elevation_m | depth_km | provider | station_id | station_term_time_p | station_term_time_s | station_term_amplitude | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | CI | CCC | NaN | BH | ENZ | 35.524950 | -117.364530 | 670.0 | -0.6700 | SCEDC | CI.CCC..BH | 0.266086 | 0.500831 | 0.049399 |
1 | CI | CCC | NaN | HH | ENZ | 35.524950 | -117.364530 | 670.0 | -0.6700 | SCEDC | CI.CCC..HH | 0.295428 | 0.518465 | 0.191475 |
2 | CI | CCC | NaN | HN | ENZ | 35.524950 | -117.364530 | 670.0 | -0.6700 | SCEDC | CI.CCC..HN | 0.296263 | 0.541148 | 0.064485 |
3 | CI | CLC | NaN | BH | ENZ | 35.815740 | -117.597510 | 775.0 | -0.7750 | SCEDC | CI.CLC..BH | -0.231963 | -0.415271 | -0.331371 |
4 | CI | CLC | NaN | HH | ENZ | 35.815740 | -117.597510 | 775.0 | -0.7750 | SCEDC | CI.CLC..HH | -0.168743 | -0.390045 | -0.140313 |
5 | CI | CLC | NaN | HN | ENZ | 35.815740 | -117.597510 | 775.0 | -0.7750 | SCEDC | CI.CLC..HN | -0.175671 | -0.388116 | -0.249066 |
6 | CI | DTP | NaN | BH | ENZ | 35.267420 | -117.845810 | 908.0 | -0.9080 | SCEDC | CI.DTP..BH | -0.305881 | -0.602459 | -0.503411 |
7 | CI | DTP | NaN | HH | ENZ | 35.267420 | -117.845810 | 908.0 | -0.9080 | SCEDC | CI.DTP..HH | -0.263705 | -0.564867 | -0.437951 |
8 | CI | DTP | NaN | HN | ENZ | 35.267420 | -117.845810 | 908.0 | -0.9080 | SCEDC | CI.DTP..HN | -0.244383 | -0.538990 | -0.500516 |
9 | CI | JRC2 | NaN | BH | ENZ | 35.982490 | -117.808850 | 1469.0 | -1.4690 | SCEDC | CI.JRC2..BH | 0.011361 | -0.080285 | -0.039941 |
10 | CI | JRC2 | NaN | HH | ENZ | 35.982490 | -117.808850 | 1469.0 | -1.4690 | SCEDC | CI.JRC2..HH | 0.053539 | -0.052748 | 0.068213 |
11 | CI | JRC2 | NaN | HN | ENZ | 35.982490 | -117.808850 | 1469.0 | -1.4690 | SCEDC | CI.JRC2..HN | 0.059764 | -0.045991 | -0.007637 |
12 | CI | LRL | NaN | BH | ENZ | 35.479540 | -117.682120 | 1340.0 | -1.3400 | SCEDC | CI.LRL..BH | -0.295604 | -0.540857 | 0.033788 |
13 | CI | LRL | NaN | HH | ENZ | 35.479540 | -117.682120 | 1340.0 | -1.3400 | SCEDC | CI.LRL..HH | -0.268381 | -0.513955 | 0.146876 |
14 | CI | LRL | NaN | HN | ENZ | 35.479540 | -117.682120 | 1340.0 | -1.3400 | SCEDC | CI.LRL..HN | -0.266329 | -0.503088 | 0.045499 |
15 | CI | LRL | 2C | HN | ENZ | 35.479540 | -117.682120 | 1340.0 | -1.3400 | SCEDC | CI.LRL.2C.HN | 0.000000 | 0.000000 | 0.000000 |
16 | CI | MPM | NaN | BH | ENZ | 36.057990 | -117.489010 | 1839.0 | -1.8390 | SCEDC | CI.MPM..BH | -0.011825 | -0.098089 | -0.518793 |
17 | CI | MPM | NaN | HH | ENZ | 36.057990 | -117.489010 | 1839.0 | -1.8390 | SCEDC | CI.MPM..HH | 0.009095 | -0.081896 | -0.459349 |
18 | CI | MPM | NaN | HN | ENZ | 36.057990 | -117.489010 | 1839.0 | -1.8390 | SCEDC | CI.MPM..HN | 0.011870 | -0.052277 | -0.532764 |
19 | CI | Q0072 | 01 | HN | ENZ | 35.609617 | -117.666721 | 695.0 | -0.6950 | SCEDC | CI.Q0072.01.HN | 0.000000 | 0.000000 | 0.000000 |
20 | CI | SLA | NaN | BH | ENZ | 35.890950 | -117.283320 | 1174.0 | -1.1740 | SCEDC | CI.SLA..BH | 0.066893 | 0.118500 | -0.081634 |
21 | CI | SLA | NaN | HH | ENZ | 35.890950 | -117.283320 | 1174.0 | -1.1740 | SCEDC | CI.SLA..HH | 0.089833 | 0.128589 | -0.042928 |
22 | CI | SLA | NaN | HN | ENZ | 35.890950 | -117.283320 | 1174.0 | -1.1740 | SCEDC | CI.SLA..HN | 0.093526 | 0.168238 | -0.150381 |
23 | CI | SRT | NaN | BH | ENZ | 35.692350 | -117.750510 | 667.0 | -0.6670 | SCEDC | CI.SRT..BH | 0.148171 | 0.653411 | -0.253527 |
24 | CI | SRT | NaN | HH | ENZ | 35.692350 | -117.750510 | 667.0 | -0.6670 | SCEDC | CI.SRT..HH | 0.183868 | 0.641707 | -0.208859 |
25 | CI | SRT | NaN | HN | ENZ | 35.692350 | -117.750510 | 667.0 | -0.6670 | SCEDC | CI.SRT..HN | 0.175475 | 0.674587 | -0.295869 |
26 | CI | TOW2 | NaN | BH | ENZ | 35.808560 | -117.764880 | 685.0 | -0.6850 | SCEDC | CI.TOW2..BH | 0.268083 | 0.816279 | 0.028038 |
27 | CI | TOW2 | NaN | HH | ENZ | 35.808560 | -117.764880 | 685.0 | -0.6850 | SCEDC | CI.TOW2..HH | 0.285970 | 0.831533 | 0.102681 |
28 | CI | TOW2 | NaN | HN | ENZ | 35.808560 | -117.764880 | 685.0 | -0.6850 | SCEDC | CI.TOW2..HN | 0.285113 | 0.843886 | -0.006698 |
29 | CI | WBM | NaN | BH | ENZ | 35.608390 | -117.890490 | 892.0 | -0.8920 | SCEDC | CI.WBM..BH | 0.136927 | 0.128744 | -0.166279 |
30 | CI | WBM | NaN | HH | ENZ | 35.608390 | -117.890490 | 892.0 | -0.8920 | SCEDC | CI.WBM..HH | 0.152299 | 0.124250 | -0.138614 |
31 | CI | WBM | NaN | HN | ENZ | 35.608390 | -117.890490 | 892.0 | -0.8920 | SCEDC | CI.WBM..HN | 0.170719 | 0.179461 | -0.101436 |
32 | CI | WBM | 2C | HN | ENZ | 35.608390 | -117.890490 | 892.0 | -0.8920 | SCEDC | CI.WBM.2C.HN | 0.000000 | 0.000000 | 0.000000 |
33 | CI | WCS2 | NaN | BH | ENZ | 36.025210 | -117.765260 | 1143.0 | -1.1430 | SCEDC | CI.WCS2..BH | 0.030349 | -0.126935 | 0.023642 |
34 | CI | WCS2 | NaN | HH | ENZ | 36.025210 | -117.765260 | 1143.0 | -1.1430 | SCEDC | CI.WCS2..HH | 0.065321 | -0.099447 | 0.146628 |
35 | CI | WCS2 | NaN | HN | ENZ | 36.025210 | -117.765260 | 1143.0 | -1.1430 | SCEDC | CI.WCS2..HN | 0.061651 | -0.096450 | 0.037563 |
36 | CI | WMF | NaN | BH | ENZ | 36.117580 | -117.854860 | 1537.4 | -1.5374 | SCEDC | CI.WMF..BH | 0.039416 | -0.005761 | -0.165183 |
37 | CI | WMF | NaN | HH | ENZ | 36.117580 | -117.854860 | 1537.4 | -1.5374 | SCEDC | CI.WMF..HH | 0.075658 | 0.005327 | -0.079900 |
38 | CI | WMF | NaN | HN | ENZ | 36.117580 | -117.854860 | 1537.4 | -1.5374 | SCEDC | CI.WMF..HN | 0.085427 | 0.025273 | -0.240971 |
39 | CI | WMF | 2C | HN | ENZ | 36.117580 | -117.854860 | 1537.4 | -1.5374 | SCEDC | CI.WMF.2C.HN | 0.000000 | 0.000000 | 0.000000 |
40 | CI | WNM | NaN | EH | Z | 35.842200 | -117.906160 | 974.3 | -0.9743 | SCEDC | CI.WNM..EH | -0.026889 | -0.070269 | -0.332510 |
41 | CI | WNM | NaN | HN | ENZ | 35.842200 | -117.906160 | 974.3 | -0.9743 | SCEDC | CI.WNM..HN | -0.011659 | -0.118223 | -0.038719 |
42 | CI | WNM | 2C | HN | ENZ | 35.842200 | -117.906160 | 974.3 | -0.9743 | SCEDC | CI.WNM.2C.HN | 0.000000 | 0.000000 | 0.000000 |
43 | CI | WRC2 | NaN | BH | ENZ | 35.947900 | -117.650380 | 943.0 | -0.9430 | SCEDC | CI.WRC2..BH | -0.023860 | -0.043523 | 0.117833 |
44 | CI | WRC2 | NaN | HH | ENZ | 35.947900 | -117.650380 | 943.0 | -0.9430 | SCEDC | CI.WRC2..HH | 0.010633 | -0.029488 | 0.165234 |
45 | CI | WRC2 | NaN | HN | ENZ | 35.947900 | -117.650380 | 943.0 | -0.9430 | SCEDC | CI.WRC2..HN | 0.014658 | -0.021367 | 0.103905 |
46 | CI | WRV2 | NaN | EH | Z | 36.007740 | -117.890400 | 1070.0 | -1.0700 | SCEDC | CI.WRV2..EH | -0.003461 | -0.154572 | -0.355199 |
47 | CI | WRV2 | NaN | HN | ENZ | 36.007740 | -117.890400 | 1070.0 | -1.0700 | SCEDC | CI.WRV2..HN | 0.017317 | -0.137916 | -0.273270 |
48 | CI | WRV2 | 2C | HN | ENZ | 36.007740 | -117.890400 | 1070.0 | -1.0700 | SCEDC | CI.WRV2.2C.HN | 0.000000 | 0.000000 | 0.000000 |
49 | CI | WVP2 | NaN | EH | Z | 35.949390 | -117.817690 | 1465.0 | -1.4650 | SCEDC | CI.WVP2..EH | 0.020325 | 0.008989 | -0.341606 |
50 | CI | WVP2 | NaN | HN | ENZ | 35.949390 | -117.817690 | 1465.0 | -1.4650 | SCEDC | CI.WVP2..HN | 0.024742 | -0.103024 | -0.089014 |
51 | CI | WVP2 | 2C | HN | ENZ | 35.949390 | -117.817690 | 1465.0 | -1.4650 | SCEDC | CI.WVP2.2C.HN | 0.000000 | 0.000000 | 0.000000 |
The following shows a very rudimentary map of the station network. Look into the cartopy
package for more sophisticated maps.
_station_meta = station_meta.drop_duplicates("station")
fig, ax = plt.subplots(num="station_network", figsize=(10, 10))
ax.scatter(_station_meta["longitude"], _station_meta["latitude"], marker="v", color="k")
for idx, row in _station_meta.iterrows():
ax.text(row.longitude + 0.01, row.latitude + 0.01, row.station, va="bottom", ha="left")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.grid()
ax.set_title("Stations used to build the PhaseNet catalog")
Text(0.5, 1.0, 'Stations used to build the PhaseNet catalog')
event_meta = pd.read_csv(os.path.join(DIR_CATALOG, EVENT_FILE))
event_meta
time | adloc_score | adloc_residual_time | num_picks | magnitude | adloc_residual_amplitude | event_index | longitude | latitude | depth_km | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 2019-07-04 00:46:47.342963596 | 0.866596 | 0.057355 | 35 | 0.595368 | 0.165480 | 6720 | -117.882570 | 36.091088 | 4.643969 |
1 | 2019-07-04 00:55:32.648412579 | 0.781116 | 0.203567 | 16 | 1.142510 | 0.170509 | 17122 | -117.799226 | 35.378160 | 11.078458 |
2 | 2019-07-04 00:56:37.232733104 | 0.908073 | 0.086183 | 42 | 0.912494 | 0.166681 | 5411 | -117.880902 | 36.091986 | 4.854336 |
3 | 2019-07-04 02:00:39.149363202 | 0.814322 | 0.036164 | 15 | 0.209530 | 0.092534 | 17868 | -117.866468 | 36.093520 | 4.981447 |
4 | 2019-07-04 03:05:31.018885833 | 0.799281 | 0.080708 | 11 | 0.104050 | 0.156833 | 25037 | -117.846320 | 36.100386 | 5.943363 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
20898 | 2019-07-09 23:58:14.298499048 | 0.933013 | 0.097816 | 67 | 1.543257 | 0.131523 | 1907 | -117.711811 | 35.926468 | 6.652020 |
20899 | 2019-07-09 23:58:47.701746285 | 0.882434 | 0.090185 | 73 | 1.087089 | 0.140159 | 2051 | -117.604263 | 35.797450 | 6.617413 |
20900 | 2019-07-09 23:59:05.102247662 | 0.798047 | 0.435077 | 26 | 1.147040 | 0.170994 | 12567 | -117.509729 | 35.692722 | 12.815041 |
20901 | 2019-07-09 23:59:40.257837813 | 0.971081 | 0.065523 | 35 | 1.161323 | 0.068586 | 7726 | -117.846289 | 36.061435 | 5.224666 |
20902 | 2019-07-09 23:59:49.650466544 | 0.800524 | 0.023674 | 15 | 0.936622 | 0.095959 | 18566 | -117.896100 | 36.095886 | 6.761860 |
20903 rows × 10 columns
picks = pd.read_csv(os.path.join(DIR_CATALOG, PICK_FILE))
picks
station_id | phase_index | phase_time | phase_score | phase_type | dt_s | phase_polarity | phase_amplitude | event_index | sp_ratio | event_amplitude | adloc_mask | adloc_residual_time | adloc_residual_amplitude | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | CI.WMF..BH | 280874 | 2019-07-04 00:46:48.759 | 0.594 | P | 0.01 | 0.110 | -3.213107 | 6720 | 0.955091 | 3.124152e-07 | 1 | 0.003919 | 0.012320 |
1 | CI.WMF..HH | 280881 | 2019-07-04 00:46:48.818 | 0.973 | P | 0.01 | 0.938 | -3.077638 | 6720 | 0.948896 | 3.368390e-07 | 1 | 0.026491 | 0.063669 |
2 | CI.WMF..HN | 280881 | 2019-07-04 00:46:48.818 | 0.973 | P | 0.01 | 0.898 | -3.128019 | 6720 | 1.598008 | 2.146277e-07 | 1 | 0.017580 | 0.173702 |
3 | CI.WRV2..EH | 280945 | 2019-07-04 00:46:49.450 | 0.977 | P | 0.01 | -0.855 | -3.464453 | 6720 | 1.000000 | 3.298200e-07 | 1 | 0.077194 | 0.244381 |
4 | CI.WRV2..HN | 280945 | 2019-07-04 00:46:49.450 | 0.941 | P | 0.01 | -0.906 | -3.585863 | 6720 | 1.147235 | 3.908305e-07 | 1 | 0.056694 | 0.041691 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
955700 | CI.WRV2..HN | 8639355 | 2019-07-09 23:59:53.550 | 0.688 | S | 0.01 | -0.022 | -3.384471 | 18566 | 0.644492 | 1.973834e-06 | 1 | 0.019093 | 0.028409 |
955701 | CI.JRC2..HH | 8639499 | 2019-07-09 23:59:54.998 | 0.402 | S | 0.01 | -0.023 | -3.446238 | 18566 | 0.647843 | 6.197256e-06 | 1 | 0.020364 | -0.154404 |
955702 | CI.JRC2..HN | 8639499 | 2019-07-09 23:59:54.998 | 0.344 | S | 0.01 | -0.016 | -3.531800 | 18566 | 0.731796 | 4.075353e-06 | 1 | 0.012773 | -0.164659 |
955703 | CI.WVP2..EH | 8639574 | 2019-07-09 23:59:55.740 | 0.314 | S | 0.01 | -0.103 | -3.488117 | 18566 | 0.636005 | 2.815148e-06 | 1 | -0.092233 | 0.316712 |
955704 | CI.WVP2..HN | 8639576 | 2019-07-09 23:59:55.760 | 0.695 | S | 0.01 | 0.076 | -3.218676 | 18566 | 0.825223 | 4.657352e-06 | 1 | 0.039319 | 0.332722 |
955705 rows × 14 columns
Pick one event¶
Let's use the PhaseNet catalog to read the waveforms of an event.
def fetch_event_waveforms(
event_picks,
dir_waveforms=DIR_WAVEFORMS,
time_before_phase_onset_sec=2.0,
duration_sec=10.0
):
"""
Fetches the waveforms for a given event based on the picks.
Parameters
----------
event_picks : pandas.DataFrame
DataFrame containing the picks for the event.
dir_waveforms : str, optional
Directory where the waveform data is stored, by default DIR_WAVEFORMS.
time_before_phase_onset_sec : float, optional
Time in seconds to start the waveform before the phase onset, by default 2.0.
duration_sec : float, optional
Duration in seconds of the waveform to fetch, by default 10.0.
Returns
-------
obspy.Stream
Stream object containing the fetched waveforms.
"""
stream = obs.Stream()
for _, pick in event_picks.iterrows():
# check whether we have a miniseed file for this waveform
if pick.phase_type == "P":
files = glob.glob(os.path.join(dir_waveforms, pick.station_id + "Z*mseed"))
elif pick.phase_type == "S":
files = glob.glob(os.path.join(dir_waveforms, pick.station_id + "[N,E]*mseed"))
starttime = obs.UTCDateTime(pick.phase_time) - time_before_phase_onset_sec
endtime = starttime + duration_sec
for _file in files:
stream += obs.read(
_file,
starttime=starttime,
endtime=endtime
)
return stream
# explore event_meta to find a nice intermediate-size earthquake we could plot
event_meta.head(20)
time | adloc_score | adloc_residual_time | num_picks | magnitude | adloc_residual_amplitude | event_index | longitude | latitude | depth_km | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 2019-07-04 00:46:47.342963596 | 0.866596 | 0.057355 | 35 | 0.595368 | 0.165480 | 6720 | -117.882570 | 36.091088 | 4.643969 |
1 | 2019-07-04 00:55:32.648412579 | 0.781116 | 0.203567 | 16 | 1.142510 | 0.170509 | 17122 | -117.799226 | 35.378160 | 11.078458 |
2 | 2019-07-04 00:56:37.232733104 | 0.908073 | 0.086183 | 42 | 0.912494 | 0.166681 | 5411 | -117.880902 | 36.091986 | 4.854336 |
3 | 2019-07-04 02:00:39.149363202 | 0.814322 | 0.036164 | 15 | 0.209530 | 0.092534 | 17868 | -117.866468 | 36.093520 | 4.981447 |
4 | 2019-07-04 03:05:31.018885833 | 0.799281 | 0.080708 | 11 | 0.104050 | 0.156833 | 25037 | -117.846320 | 36.100386 | 5.943363 |
5 | 2019-07-04 03:20:28.674438914 | 0.755451 | 0.050436 | 44 | 0.869927 | 0.111885 | 6929 | -117.673684 | 36.114308 | 5.894466 |
6 | 2019-07-04 04:03:01.619369274 | 0.897311 | 0.143631 | 32 | 0.386310 | 0.214381 | 7607 | -117.805077 | 36.016063 | 0.396071 |
7 | 2019-07-04 05:16:47.223353119 | 0.835094 | 0.053522 | 22 | 0.575433 | 0.163675 | 13098 | -117.879256 | 36.090409 | 4.268292 |
8 | 2019-07-04 06:57:32.758991812 | 0.922386 | 0.065846 | 23 | 0.256281 | 0.065212 | 15698 | -117.867587 | 36.081665 | 5.939931 |
9 | 2019-07-04 11:51:07.805591259 | 0.692042 | 0.075023 | 48 | 0.818316 | 0.102087 | 4380 | -117.671909 | 36.118573 | 6.085804 |
10 | 2019-07-04 15:36:04.228420696 | 0.652888 | 0.012137 | 9 | -0.633904 | 0.167058 | 27370 | -117.785299 | 36.005578 | 3.234981 |
11 | 2019-07-04 15:42:47.932558745 | 0.597856 | 0.072740 | 28 | 0.933088 | 0.151623 | 9740 | -117.501116 | 35.707382 | 14.496446 |
12 | 2019-07-04 16:07:20.003321194 | 0.729246 | 0.093399 | 33 | 0.835854 | 0.133106 | 9365 | -117.491503 | 35.711241 | 13.846898 |
13 | 2019-07-04 16:11:46.920083440 | 0.711579 | 0.516367 | 18 | 1.734816 | 0.117952 | 7098 | -117.877675 | 35.196445 | 31.000000 |
14 | 2019-07-04 16:13:43.094792540 | 0.849673 | 0.070048 | 83 | 1.653859 | 0.111108 | 2305 | -117.493716 | 35.710090 | 13.375377 |
15 | 2019-07-04 16:16:07.085486307 | 0.814365 | 0.040161 | 10 | 0.587583 | 0.099619 | 26520 | -117.540198 | 35.689022 | 14.537368 |
16 | 2019-07-04 17:02:55.057058245 | 0.770696 | 0.079675 | 90 | 4.476724 | 0.143489 | 1101 | -117.495094 | 35.711607 | 13.586411 |
17 | 2019-07-04 17:04:02.231614981 | 0.664676 | 0.064038 | 41 | 2.062226 | 0.186214 | 5379 | -117.488446 | 35.711139 | 14.001705 |
18 | 2019-07-04 17:05:05.071421677 | 0.509088 | 0.089882 | 29 | 1.471434 | 0.157048 | 12115 | -117.491307 | 35.710881 | 13.223374 |
19 | 2019-07-04 17:08:51.664841725 | 0.666790 | 0.046444 | 15 | 0.657653 | 0.175270 | 23044 | -117.504506 | 35.706123 | 14.570921 |
# feel free to play with the event index to plot different events
EVENT_IDX = 1101
event_meta.set_index("event_index").loc[EVENT_IDX]
time 2019-07-04 17:02:55.057058245 adloc_score 0.770696 adloc_residual_time 0.079675 num_picks 90 magnitude 4.476724 adloc_residual_amplitude 0.143489 longitude -117.495094 latitude 35.711607 depth_km 13.586411 Name: 1101, dtype: object
# fetch the corresponding picks for this event
event_picks = picks[picks["event_index"] == EVENT_IDX]
event_picks
station_id | phase_index | phase_time | phase_score | phase_type | dt_s | phase_polarity | phase_amplitude | event_index | sp_ratio | event_amplitude | adloc_mask | adloc_residual_time | adloc_residual_amplitude | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
565 | CI.CLC..HN | 6137848 | 2019-07-04 17:02:58.488 | 0.969 | P | 0.01 | -0.879 | -0.511873 | 1101 | 1.017569 | 2.474507e-07 | 1 | -0.010048 | -0.054426 |
566 | CI.CLC..BH | 6137847 | 2019-07-04 17:02:58.489 | 0.633 | P | 0.01 | -0.324 | -0.666956 | 1101 | 1.014822 | 2.022085e-07 | 1 | 0.048153 | -0.127168 |
567 | CI.CLC..HH | 6137849 | 2019-07-04 17:02:58.498 | 0.965 | P | 0.01 | -0.883 | -0.586030 | 1101 | 1.086886 | 2.141392e-07 | 1 | -0.007135 | -0.236796 |
568 | CI.SRT..HN | 6137990 | 2019-07-04 17:02:59.908 | 0.879 | P | 0.01 | 0.699 | -0.643401 | 1101 | 1.377537 | 4.387864e-07 | 1 | -0.069463 | 0.065039 |
569 | CI.SRT..HH | 6137990 | 2019-07-04 17:02:59.908 | 0.891 | P | 0.01 | 0.734 | -0.601019 | 1101 | 0.729745 | 5.910421e-07 | 1 | -0.078144 | 0.020251 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
650 | CI.WMF..HH | 6139185 | 2019-07-04 17:03:11.858 | 0.672 | S | 0.01 | 0.021 | -1.368759 | 1101 | 1.096820 | 3.073536e-07 | 1 | 0.140179 | -0.329117 |
651 | CI.WMF..HN | 6139189 | 2019-07-04 17:03:11.898 | 0.633 | S | 0.01 | 0.026 | -1.371407 | 1101 | 0.989939 | 2.891483e-07 | 1 | 0.159858 | -0.171350 |
652 | CI.DTP..BH | 6139200 | 2019-07-04 17:03:12.019 | 0.734 | S | 0.01 | 0.035 | -1.322484 | 1101 | 1.385395 | 1.624267e-07 | 1 | 0.068940 | 0.175198 |
653 | CI.DTP..HH | 6139203 | 2019-07-04 17:03:12.038 | 0.707 | S | 0.01 | -0.015 | -1.306977 | 1101 | 1.122809 | 3.205840e-07 | 1 | 0.052824 | 0.125587 |
654 | CI.DTP..HN | 6139211 | 2019-07-04 17:03:12.118 | 0.625 | S | 0.01 | 0.043 | -1.282829 | 1101 | 0.863380 | 2.702479e-07 | 1 | 0.105026 | 0.211253 |
90 rows × 14 columns
# fetch the waveforms
event_waveforms = fetch_event_waveforms(event_picks, time_before_phase_onset_sec=10., duration_sec=30.)
print(event_waveforms.__str__(extended=True))
42 Trace(s) in Stream: CI.CLC..HHZ | 2019-07-04T17:02:48.478300Z - 2019-07-04T17:03:18.478300Z | 25.0 Hz, 751 samples CI.SRT..HHZ | 2019-07-04T17:02:49.918300Z - 2019-07-04T17:03:19.918300Z | 25.0 Hz, 751 samples CI.CCC..HHZ | 2019-07-04T17:02:50.118300Z - 2019-07-04T17:03:20.118300Z | 25.0 Hz, 751 samples CI.SLA..HHZ | 2019-07-04T17:02:50.518300Z - 2019-07-04T17:03:20.518300Z | 25.0 Hz, 751 samples CI.TOW2..HHZ | 2019-07-04T17:02:50.518300Z - 2019-07-04T17:03:20.518300Z | 25.0 Hz, 751 samples CI.LRL..HHZ | 2019-07-04T17:02:50.638300Z - 2019-07-04T17:03:20.638300Z | 25.0 Hz, 751 samples CI.WRC2..HHZ | 2019-07-04T17:02:50.718300Z - 2019-07-04T17:03:20.718300Z | 25.0 Hz, 751 samples CI.CLC..HHN | 2019-07-04T17:02:50.958300Z - 2019-07-04T17:03:20.958300Z | 25.0 Hz, 751 samples CI.CLC..HHE | 2019-07-04T17:02:50.958300Z - 2019-07-04T17:03:20.958300Z | 25.0 Hz, 751 samples CI.MPM..HHZ | 2019-07-04T17:02:52.038300Z - 2019-07-04T17:03:22.038300Z | 25.0 Hz, 751 samples CI.WBM..HHZ | 2019-07-04T17:02:52.123100Z - 2019-07-04T17:03:22.123100Z | 25.0 Hz, 751 samples CI.WVP2..EHZ | 2019-07-04T17:02:52.160000Z - 2019-07-04T17:03:22.160000Z | 25.0 Hz, 751 samples CI.WNM..EHZ | 2019-07-04T17:02:52.160000Z - 2019-07-04T17:03:22.160000Z | 25.0 Hz, 751 samples CI.JRC2..HHZ | 2019-07-04T17:02:52.558300Z - 2019-07-04T17:03:22.558300Z | 25.0 Hz, 751 samples CI.WCS2..HHZ | 2019-07-04T17:02:52.598300Z - 2019-07-04T17:03:22.598300Z | 25.0 Hz, 751 samples CI.WRV2..EHZ | 2019-07-04T17:02:53.600000Z - 2019-07-04T17:03:23.600000Z | 25.0 Hz, 751 samples CI.SRT..HHN | 2019-07-04T17:02:53.838300Z - 2019-07-04T17:03:23.838300Z | 25.0 Hz, 751 samples CI.SRT..HHE | 2019-07-04T17:02:53.838300Z - 2019-07-04T17:03:23.838300Z | 25.0 Hz, 751 samples CI.CCC..HHN | 2019-07-04T17:02:54.078300Z - 2019-07-04T17:03:24.078300Z | 25.0 Hz, 751 samples CI.CCC..HHE | 2019-07-04T17:02:54.078300Z - 2019-07-04T17:03:24.078300Z | 25.0 Hz, 751 samples CI.SLA..HHE | 2019-07-04T17:02:54.638300Z - 2019-07-04T17:03:24.638300Z | 25.0 Hz, 751 samples CI.SLA..HHN | 2019-07-04T17:02:54.638300Z - 2019-07-04T17:03:24.638300Z | 25.0 Hz, 751 samples CI.LRL..HHN | 2019-07-04T17:02:54.718300Z - 2019-07-04T17:03:24.718300Z | 25.0 Hz, 751 samples CI.LRL..HHE | 2019-07-04T17:02:54.718300Z - 2019-07-04T17:03:24.718300Z | 25.0 Hz, 751 samples CI.WMF..HHZ | 2019-07-04T17:02:54.798300Z - 2019-07-04T17:03:24.798300Z | 25.0 Hz, 751 samples CI.DTP..HHZ | 2019-07-04T17:02:54.958300Z - 2019-07-04T17:03:24.958300Z | 25.0 Hz, 751 samples CI.TOW2..HHN | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples CI.TOW2..HHE | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples CI.WRC2..HHN | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples CI.WRC2..HHE | 2019-07-04T17:02:54.998300Z - 2019-07-04T17:03:24.998300Z | 25.0 Hz, 751 samples CI.WBM..HHE | 2019-07-04T17:02:57.243100Z - 2019-07-04T17:03:27.243100Z | 25.0 Hz, 751 samples CI.WBM..HHN | 2019-07-04T17:02:57.243100Z - 2019-07-04T17:03:27.243100Z | 25.0 Hz, 751 samples CI.MPM..HHE | 2019-07-04T17:02:57.318300Z - 2019-07-04T17:03:27.318300Z | 25.0 Hz, 751 samples CI.MPM..HHN | 2019-07-04T17:02:57.318300Z - 2019-07-04T17:03:27.318300Z | 25.0 Hz, 751 samples CI.JRC2..HHN | 2019-07-04T17:02:57.958300Z - 2019-07-04T17:03:27.958300Z | 25.0 Hz, 751 samples CI.JRC2..HHE | 2019-07-04T17:02:57.958300Z - 2019-07-04T17:03:27.958300Z | 25.0 Hz, 751 samples CI.WCS2..HHE | 2019-07-04T17:02:58.118300Z - 2019-07-04T17:03:28.118300Z | 25.0 Hz, 751 samples CI.WCS2..HHN | 2019-07-04T17:02:58.118300Z - 2019-07-04T17:03:28.118300Z | 25.0 Hz, 751 samples CI.WMF..HHN | 2019-07-04T17:03:01.838300Z - 2019-07-04T17:03:31.838300Z | 25.0 Hz, 751 samples CI.WMF..HHE | 2019-07-04T17:03:01.838300Z - 2019-07-04T17:03:31.838300Z | 25.0 Hz, 751 samples CI.DTP..HHN | 2019-07-04T17:03:02.038300Z - 2019-07-04T17:03:32.038300Z | 25.0 Hz, 751 samples CI.DTP..HHE | 2019-07-04T17:03:02.038300Z - 2019-07-04T17:03:32.038300Z | 25.0 Hz, 751 samples
# plot them!
fig = event_waveforms.select(component="Z").plot(equal_scale=False)
Run template matching¶
We will now use one of the events from the PhaseNet catalog as a template event to detect events with template matching.
selected_event_meta = event_meta.set_index("event_index").loc[EVENT_IDX]
selected_event_meta
time 2019-07-04 17:02:55.057058245 adloc_score 0.770696 adloc_residual_time 0.079675 num_picks 90 magnitude 4.476724 adloc_residual_amplitude 0.143489 longitude -117.495094 latitude 35.711607 depth_km 13.586411 Name: 1101, dtype: object
selected_event_picks = picks[picks["event_index"] == EVENT_IDX]
selected_event_picks
station_id | phase_index | phase_time | phase_score | phase_type | dt_s | phase_polarity | phase_amplitude | event_index | sp_ratio | event_amplitude | adloc_mask | adloc_residual_time | adloc_residual_amplitude | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
565 | CI.CLC..HN | 6137848 | 2019-07-04 17:02:58.488 | 0.969 | P | 0.01 | -0.879 | -0.511873 | 1101 | 1.017569 | 2.474507e-07 | 1 | -0.010048 | -0.054426 |
566 | CI.CLC..BH | 6137847 | 2019-07-04 17:02:58.489 | 0.633 | P | 0.01 | -0.324 | -0.666956 | 1101 | 1.014822 | 2.022085e-07 | 1 | 0.048153 | -0.127168 |
567 | CI.CLC..HH | 6137849 | 2019-07-04 17:02:58.498 | 0.965 | P | 0.01 | -0.883 | -0.586030 | 1101 | 1.086886 | 2.141392e-07 | 1 | -0.007135 | -0.236796 |
568 | CI.SRT..HN | 6137990 | 2019-07-04 17:02:59.908 | 0.879 | P | 0.01 | 0.699 | -0.643401 | 1101 | 1.377537 | 4.387864e-07 | 1 | -0.069463 | 0.065039 |
569 | CI.SRT..HH | 6137990 | 2019-07-04 17:02:59.908 | 0.891 | P | 0.01 | 0.734 | -0.601019 | 1101 | 0.729745 | 5.910421e-07 | 1 | -0.078144 | 0.020251 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
650 | CI.WMF..HH | 6139185 | 2019-07-04 17:03:11.858 | 0.672 | S | 0.01 | 0.021 | -1.368759 | 1101 | 1.096820 | 3.073536e-07 | 1 | 0.140179 | -0.329117 |
651 | CI.WMF..HN | 6139189 | 2019-07-04 17:03:11.898 | 0.633 | S | 0.01 | 0.026 | -1.371407 | 1101 | 0.989939 | 2.891483e-07 | 1 | 0.159858 | -0.171350 |
652 | CI.DTP..BH | 6139200 | 2019-07-04 17:03:12.019 | 0.734 | S | 0.01 | 0.035 | -1.322484 | 1101 | 1.385395 | 1.624267e-07 | 1 | 0.068940 | 0.175198 |
653 | CI.DTP..HH | 6139203 | 2019-07-04 17:03:12.038 | 0.707 | S | 0.01 | -0.015 | -1.306977 | 1101 | 1.122809 | 3.205840e-07 | 1 | 0.052824 | 0.125587 |
654 | CI.DTP..HN | 6139211 | 2019-07-04 17:03:12.118 | 0.625 | S | 0.01 | 0.043 | -1.282829 | 1101 | 0.863380 | 2.702479e-07 | 1 | 0.105026 | 0.211253 |
90 rows × 14 columns
Read data from same day¶
def fetch_day_waveforms(dir_waveforms):
"""
Fetches the continuous seismograms for a given day.
Parameters
----------
dir_waveforms : str
Directory where the waveform data is stored, by default DIR_WAVEFORMS.
Returns
-------
obspy.Stream
Stream object containing the fetched continuous seismograms.
"""
stream = obs.Stream()
files = glob.glob(os.path.join(dir_waveforms, "*mseed"))
for _file in files:
stream += obs.read(_file)
return stream
# first, read the continuous seismograms into an `obspy.Stream`
continuous_seismograms = fetch_day_waveforms(DIR_WAVEFORMS)
print(continuous_seismograms.__str__(extended=True))
57 Trace(s) in Stream: CI.WCS2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B916..EHZ | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples CI.CLC..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WRV2..EHZ | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples PB.B916..EH2 | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples PB.B917..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WMF..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B918..EHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B918..EH2 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.TOW2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SLA..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WRC2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.JRC2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.CCC..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DTP..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SRT..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.LRL..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SLA..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.MPM..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DTP..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WBM..HHZ | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples CI.WCS2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.CLC..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B921..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DAW..HHZ | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples CI.CLC..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WMF..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.TOW2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DAW..HHE | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples PB.B918..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WRC2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.JRC2..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.CCC..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B916..EH1 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.MPM..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DTP..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WBM..HHE | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples CI.SRT..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B917..EHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WVP2..EHZ | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples PB.B917..EH2 | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.LRL..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.CCC..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SLA..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WNM..EHZ | 2019-07-04T00:00:00.000000Z - 2019-07-04T23:59:59.960000Z | 25.0 Hz, 2160000 samples CI.WRC2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.JRC2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.SRT..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.LRL..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.MPM..HHN | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples PB.B921..EHZ | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples CI.WBM..HHN | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples PB.B921..EH2 | 2019-07-03T23:59:59.998200Z - 2019-07-04T23:59:59.958200Z | 25.0 Hz, 2160000 samples CI.WMF..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.WCS2..HHZ | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.TOW2..HHE | 2019-07-03T23:59:59.998300Z - 2019-07-04T23:59:59.958300Z | 25.0 Hz, 2160000 samples CI.DAW..HHN | 2019-07-04T00:00:00.003100Z - 2019-07-04T23:59:59.963100Z | 25.0 Hz, 2160000 samples
# plot the continuous seismograms from a single station
fig = continuous_seismograms.select(station="CLC").plot()
# then, cast data into `numpy.ndarray`
station_codes = list(set([st.stats.station for st in continuous_seismograms]))
component_codes = ["N", "E", "Z"]
component_aliases={"E": ["E", "2"], "N": ["N", "1"], "Z": ["Z"]}
num_stations = len(station_codes)
num_channels = len(component_codes)
num_samples = len(continuous_seismograms[0].data)
continuous_seismograms_arr = np.zeros((num_stations, num_channels, num_samples), dtype=np.float32)
for s, sta in enumerate(station_codes):
for c, cp in enumerate(component_codes):
for cp_alias in component_aliases[cp]:
sel_seismogram = continuous_seismograms.select(station=sta, component=cp_alias)
if len(sel_seismogram) > 0:
continuous_seismograms_arr[s, c, :] = sel_seismogram[0].data
break
continuous_seismograms_arr
array([[[ 4.09647620e-12, 1.35595988e-11, 1.61010198e-11, ..., 1.87989971e-10, 5.81934279e-10, 1.59266211e-11], [ 8.85055397e-12, 1.41080064e-11, 2.29350306e-12, ..., -1.22290775e-10, 4.84999090e-11, 1.02444719e-10], [-1.44921930e-11, -8.83197541e-13, 5.94794233e-13, ..., 3.61067509e-10, 1.61199512e-10, -2.93843921e-10]], [[-2.02754636e-11, 7.27350949e-12, 2.90683051e-11, ..., -1.27593561e-10, 4.12396610e-11, 2.02138389e-10], [-8.89091231e-12, 1.46336155e-11, 4.67565708e-11, ..., -1.25822366e-10, 5.78710552e-11, 4.38750390e-11], [ 5.92738358e-11, 3.67502244e-11, -1.45799206e-10, ..., -4.12412986e-11, 2.63505495e-10, -1.57162269e-11]], [[ 5.47254013e-11, 7.05789246e-11, -6.25662326e-12, ..., -3.65891401e-10, -7.81559206e-10, -6.38798348e-10], [-1.39255118e-11, 2.02905037e-11, 1.07095530e-11, ..., 1.02497244e-09, 7.77831466e-10, -4.58769023e-10], [ 1.94260233e-10, -3.34868328e-10, -1.75106818e-10, ..., 9.01718644e-10, -9.07585604e-11, -1.75413142e-10]], ..., [[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [-2.38018650e-11, 4.41909790e-12, -4.36377739e-11, ..., -4.77496098e-11, -1.39665293e-10, -1.53408883e-10]], [[ 4.53147459e-12, 2.15497811e-11, 4.97441671e-12, ..., 1.37320554e-11, -8.23028538e-12, 2.90706681e-10], [-2.32118665e-11, -7.40775150e-12, 4.76967381e-12, ..., -5.22246857e-10, -1.66053525e-11, 4.45412207e-10], [-6.16842992e-12, 4.56698828e-12, 1.07985522e-11, ..., 1.94127353e-10, 1.03605340e-10, -3.92332278e-10]], [[ 1.00288475e-11, -1.25283534e-11, -8.96174020e-12, ..., 8.97525543e-12, -6.86870838e-11, -2.98185018e-11], [ 1.08289614e-12, 8.16112889e-12, -1.36335960e-11, ..., 4.46155092e-11, 7.22212498e-11, -3.26238862e-11], [-7.30912639e-12, 1.16460912e-11, 5.44300976e-12, ..., 2.24157377e-11, 3.17579747e-11, -6.67901151e-11]]], dtype=float32)
Build template¶
------------------ Background ------------------¶
A template is a collection of waveforms at different channels, $T_{s,c}(t)$, which are clips taken from the continuous seismograms, $u_{s,c}$. These clips are taken at times defined by: $$ u_{s,c}(t)\ |\ t \in \lbrace \tau_{s,c}; \tau_{s,c} + D \rbrace, $$ where $\tau_{s,c}$ is the start time of the template window and $D$ is the template duration.
$\tau_{s,c}$ is given by some prior information on the event: picks or modeled arrival times. The moveouts, $\tilde{\tau}_{s,c}$, are the collection of delay times relative to the earliest $\tau_{s,c}$: $$ \tilde{\tau}_{s,c} = \tau_{s,c} - \underset{s,c}{\min} \lbrace \tau_{s,c} \rbrace .$$
------------------ On the necessity to clip template waveforms out of numpy.ndarray
instead of obspy.Stream
------------------¶
Looking carefully at the output of print(continuous_seismograms.__str__(extended=True))
, a few cells before, we see that start times are generally not exactly at midnight. This is a consequence of the discrete nature of the continuous seismograms (here, sampled at 25 samples per second). Thus, in general, the $\tau_{s,c}$ computed from picks or modeled arrival times fall in between two samples of the seismograms.
When running a matched-filter search, we need to make sure the moveouts, $\tilde{\tau}_{s,c}$, ultimately expressed in samples, match exactly the times that were used when clipping the template waveforms out of $u_{s,c}$. One way to ensure this is to first cast the $\tau_{s,c}$ to times in samples and then operate exclusively on the numpy.ndarray
: continuous_seismograms_arr
:
$$ T_{s,c}[t_n] = u_{s,c}[\tau_{s,c} + n \Delta t],$$ where $\Delta t$ is the sampling time.
Clip out waveforms and moveout and station-weight arrays¶
# PHASE_ON_COMP: dictionary defining which moveout we use to extract the waveform.
# Here, we use windows centered around the S wave for horizontal components
# and windows starting 1sec before the P wave for the vertical component.
PHASE_ON_COMP = {"N": "S", "E": "S", "Z": "P"}
# OFFSET_PHASE_SEC: dictionary defining the time offset taken before a given phase
# for example OFFSET_PHASE_SEC["P"] = 1.0 means that we extract the window
# 1 second before the predicted P arrival time
OFFSET_PHASE_SEC = {"P": 1.0, "S": 4.0}
# TEMPLATE_DURATION_SEC
TEMPLATE_DURATION_SEC = 8.
# SAMPLING_RATE_HZ
SAMPLING_RATE_HZ = 25.
# TEMPLATE_DURATION_SAMP
TEMPLATE_DURATION_SAMP = int(TEMPLATE_DURATION_SEC * SAMPLING_RATE_HZ)
# add station_code columns to `selected_event_picks`
selected_event_picks.set_index("station_id", inplace=True)
for staid in selected_event_picks.index:
station_code = staid.split(".")[1]
selected_event_picks.loc[staid, "station_code"] = station_code
selected_event_picks
/tmp/ipykernel_471469/1881751651.py:5: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy selected_event_picks.loc[staid, "station_code"] = station_code
phase_index | phase_time | phase_score | phase_type | dt_s | phase_polarity | phase_amplitude | event_index | sp_ratio | event_amplitude | adloc_mask | adloc_residual_time | adloc_residual_amplitude | station_code | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
station_id | ||||||||||||||
CI.CLC..HN | 6137848 | 2019-07-04 17:02:58.488 | 0.969 | P | 0.01 | -0.879 | -0.511873 | 1101 | 1.017569 | 2.474507e-07 | 1 | -0.010048 | -0.054426 | CLC |
CI.CLC..BH | 6137847 | 2019-07-04 17:02:58.489 | 0.633 | P | 0.01 | -0.324 | -0.666956 | 1101 | 1.014822 | 2.022085e-07 | 1 | 0.048153 | -0.127168 | CLC |
CI.CLC..HH | 6137849 | 2019-07-04 17:02:58.498 | 0.965 | P | 0.01 | -0.883 | -0.586030 | 1101 | 1.086886 | 2.141392e-07 | 1 | -0.007135 | -0.236796 | CLC |
CI.SRT..HN | 6137990 | 2019-07-04 17:02:59.908 | 0.879 | P | 0.01 | 0.699 | -0.643401 | 1101 | 1.377537 | 4.387864e-07 | 1 | -0.069463 | 0.065039 | SRT |
CI.SRT..HH | 6137990 | 2019-07-04 17:02:59.908 | 0.891 | P | 0.01 | 0.734 | -0.601019 | 1101 | 0.729745 | 5.910421e-07 | 1 | -0.078144 | 0.020251 | SRT |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
CI.WMF..HH | 6139185 | 2019-07-04 17:03:11.858 | 0.672 | S | 0.01 | 0.021 | -1.368759 | 1101 | 1.096820 | 3.073536e-07 | 1 | 0.140179 | -0.329117 | WMF |
CI.WMF..HN | 6139189 | 2019-07-04 17:03:11.898 | 0.633 | S | 0.01 | 0.026 | -1.371407 | 1101 | 0.989939 | 2.891483e-07 | 1 | 0.159858 | -0.171350 | WMF |
CI.DTP..BH | 6139200 | 2019-07-04 17:03:12.019 | 0.734 | S | 0.01 | 0.035 | -1.322484 | 1101 | 1.385395 | 1.624267e-07 | 1 | 0.068940 | 0.175198 | DTP |
CI.DTP..HH | 6139203 | 2019-07-04 17:03:12.038 | 0.707 | S | 0.01 | -0.015 | -1.306977 | 1101 | 1.122809 | 3.205840e-07 | 1 | 0.052824 | 0.125587 | DTP |
CI.DTP..HN | 6139211 | 2019-07-04 17:03:12.118 | 0.625 | S | 0.01 | 0.043 | -1.282829 | 1101 | 0.863380 | 2.702479e-07 | 1 | 0.105026 | 0.211253 | DTP |
90 rows × 14 columns
In the following cell, we build the numpy.ndarray
of moveouts $\tilde{\tau}_{s,c}$, expressed in units of samples.
# first, we extract the set of relative delay times of the beginning of each
# template window on a given station and component
tau_s_c_sec = np.zeros((num_stations, num_channels), dtype=np.float64)
for s, sta in enumerate(station_codes):
for c, cp in enumerate(component_codes):
phase_type = PHASE_ON_COMP[cp]
picks_s_c = selected_event_picks[
(
(selected_event_picks["station_code"] == sta)
& (selected_event_picks["phase_type"] == phase_type)
)
]
if len(picks_s_c) == 0:
# no pick for this station/component: set to -999
tau_s_c_sec[s, c] = -999
elif len(picks_s_c) == 1:
# express pick relative to beginning of day (midnight)
_pick = pd.Timestamp(picks_s_c["phase_time"])
_relative_pick_sec = (_pick - pd.Timestamp(_pick.strftime("%Y-%m-%d"))).total_seconds()
tau_s_c_sec[s, c] = _relative_pick_sec - OFFSET_PHASE_SEC[phase_type]
else:
# there were several picks from different channels: average them
_relative_pick_sec = 0.
for _pick in picks_s_c["phase_time"].values:
_pick = pd.Timestamp(_pick)
_relative_pick_sec += (_pick - pd.Timestamp(_pick.strftime("%Y-%m-%d"))).total_seconds()
_relative_pick_sec /= float(len(picks_s_c["phase_time"]))
tau_s_c_sec[s, c] = _relative_pick_sec - OFFSET_PHASE_SEC[phase_type]
# now, we convert these relative times into samples
# and express them relative to the earliest time
# we also store in memory the minimum time offset `tau_min_samp` for the next step
moveouts_samp_arr = (tau_s_c_sec * SAMPLING_RATE_HZ).astype(np.int64)
tau_min_samp = np.min(moveouts_samp_arr[moveouts_samp_arr > 0])
moveouts_samp_arr = moveouts_samp_arr - tau_min_samp
moveouts_samp_arr
array([[-1559399, -1559399, -1559399], [-1559399, -1559399, -1559399], [ 156, 156, 104], [ 78, 78, 53], [-1559399, -1559399, -1559399], [ 225, 225, 140], [-1559399, -1559399, -1559399], [ 101, 101, 68], [ 272, 272, 171], [ 277, 277, 174], [ 91, 91, 62], [ 179, 179, 116], [ 160, 160, 105], [ 0, 0, 13], [ 175, 175, 115], [ 102, 102, 64], [ 94, 94, 66], [ 72, 72, 48], [ 165, 165, 105], [ 159, 159, 102], [-1559399, -1559399, -1559399]])
Next, we use the moveouts, in samples, to clip out the relevant template waveforms from the continuous seismograms.
template_waveforms_arr = np.zeros((num_stations, num_channels, TEMPLATE_DURATION_SAMP), dtype=np.float32)
weights_arr = np.ones((num_stations, num_channels), dtype=np.float32)
for s, sta in enumerate(station_codes):
for c, cp in enumerate(component_codes):
if moveouts_samp_arr[s, c] < 0:
# no picks were found on this station
weights_arr[s, c] = 0.
continue
starttime = tau_min_samp + moveouts_samp_arr[s, c]
endtime = starttime + TEMPLATE_DURATION_SAMP
template_waveforms_arr[s, c, :] = continuous_seismograms_arr[s, c, starttime:endtime]
if template_waveforms_arr[s, c, :].sum() == 0.:
# no data was available on this channel
weights_arr[s, c] = 0.
template_waveforms_arr
array([[[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00]], [[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00]], [[-1.15102615e-04, -2.08503159e-04, -8.57295672e-05, ..., -3.02342505e-05, 1.90505059e-04, 2.32697013e-04], [-6.95939161e-06, -1.14591476e-05, -4.32817069e-05, ..., 1.95618981e-04, -6.08503397e-05, -1.64061450e-04], [-3.15401110e-08, 6.19310697e-07, 1.24642884e-06, ..., -6.75416959e-05, 9.95227219e-06, 9.39235106e-05]], ..., [[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 6.11326456e-08, 1.84645728e-07, 2.68150814e-07, ..., -1.40672455e-05, 9.12318046e-06, 1.10272435e-06]], [[ 9.79182732e-05, 1.20548120e-05, -6.78718134e-05, ..., -4.39741780e-05, -8.56743718e-05, -4.07844163e-05], [-1.06544954e-04, -2.09494847e-05, 3.22320338e-05, ..., -2.05596989e-05, -2.35213724e-06, 1.43820762e-05], [ 7.50264846e-08, -1.39887931e-07, -4.09550466e-07, ..., 2.37395070e-05, -6.56054617e-05, 1.63327525e-06]], [[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00]]], dtype=float32)
fig, axes = plt.subplots(num="template_waveforms", nrows=num_stations, ncols=num_channels, figsize=(16, 20), sharex=True)
for s, sta in enumerate(station_codes):
for c, cp in enumerate(component_codes):
time_sec = (max(0., moveouts_samp_arr[s, c]) + np.arange(TEMPLATE_DURATION_SAMP)) / SAMPLING_RATE_HZ
axes[s, c].plot(
time_sec,
template_waveforms_arr[s, c, :],
color="k",
lw=0.75
)
axes[s, c].text(0.02, 0.96, sta + " " + cp, ha="left", va="top", transform=axes[s, c].transAxes)
# axes[s, c].set_title(sta + " " + cp)
if s == num_stations - 1:
axes[s, c].set_xlabel("Time (s)")
if c == 0:
axes[s, c].set_ylabel("Amp.")
axes[s, c].grid()
# plt.tight_layout()
# normalize template waveforms for numerical reasons
norm = np.std(template_waveforms_arr, axis=-1, keepdims=True)
norm[norm == 0.] = 1.
template_waveforms_arr /= norm
# normalize weights so that they sum up to one
weights_arr /= np.sum(weights_arr)
# normalize continuous seismograms for numerical reasons
norm = np.std(continuous_seismograms_arr, axis=-1, keepdims=True)
norm[norm == 0.] = 1.
continuous_seismograms_arr /= norm
Run FMF¶
After all this data formatting, we can now run template matching (also called matched-filtering) to detect new events that are similar to our template event.
For that, use the software Fast Matched Filter (FMF): https://github.com/beridel/fast_matched_filter
FMF offers C and CUDA-C routines to efficiently run template matching on CPUs, or even on GPUs if available to you.
# FMF_STEP_SAMP: this is the step between two consecutive calculation of the correlation coefficient
FMF_STEP_SAMP = 1
# ARCH: it determines whether you want to use GPUs or CPUs
# If you do not have an Nvidia GPU, set ARCH = "cpu"
ARCH = "gpu"
The following cell computes the time series of correlation coefficients between the template waveforms, $T_{s,c}$, and the continuous seismograms, $u_{s,c}$: $$ CC(t) = \sum_{s,c} w_{s,c} \sum_{i=1}^N \dfrac{T^*_{s,c}(n \Delta t) u^*_{s,c}(t + \tilde{\tau}_{s,c} + n \Delta t)}{\sqrt{\sum_{i=1}^N {T^*_{s,c}}^2(n \Delta t) \sum_{i=1}^N {u^*_{s,c}}^2(t + \tilde{\tau}_{s,c} + n \Delta t)}}, $$ with:
- $T^*_{s,c} = T_{s,c} - \frac{1}{N} \sum_{i=1}^N T_{s,c}(n \Delta t)$,
- $u^*_{s,c}(t) = u_{s,c}(t) - \frac{1}{N} \sum_{i=1}^N u_{s,c}(t + n \Delta t)$.
Note that because the seismograms were filtered below periods that are shorter than the template window ($N \Delta t$) we have $T^*_{s,c} \approx T_{s,c}$ and $u^*_{s,c} \approx u_{s,c}$, which spares us the computation of the mean in each sliding window.
cc = fmf.matched_filter(
template_waveforms_arr.astype(np.float32),
moveouts_samp_arr.astype(np.int32),
weights_arr.astype(np.float32),
continuous_seismograms_arr.astype(np.float32),
FMF_STEP_SAMP,
arch=ARCH,
)
# FMF is programmed to handle multiple templates at once. Here, we only used
# a single template, hence the size of the outermost axis of "1"
cc.shape
(1, 2159801)
# let's print the output of our template matching run, which a time series of network-averaged correlation coefficients
# of same duration as the continuous seismograms
_cc = cc[0, :]
time_cc = np.arange(len(_cc)) / SAMPLING_RATE_HZ
fig = plt.figure("network_averaged_cc", figsize=(20, 6))
gs = fig.add_gridspec(ncols=4)
ax1 = fig.add_subplot(gs[:3])
ax1.plot(time_cc, _cc, lw=0.75)
ax1.set_xlabel("Elapsed time (sec)")
ax1.set_ylabel("Network-averaged cc")
ax1.set_xlim(time_cc.min(), time_cc.max())
ax1.set_title("Time series of template/continuous seismograms similarity")
ax2 = fig.add_subplot(gs[3], sharey=ax1)
_ = ax2.hist(_cc, orientation="horizontal", bins=250, density=True)
ax2.set_xlabel("Normalized count")
ax2.set_title("Histogram")
for ax in [ax1, ax2]:
ax.grid()
Set detection threshold and find events¶
We will use the time series of correlation coefficients to build an earthquake catalog. For that, we need to set a detection threshold and define all times above that threshold as triggers caused by near-repeats of the template event.
def select_cc_indexes(
cc_t,
threshold,
search_win,
):
"""Select the peaks in the CC time series.
Parameters
------------
cc_t: (n_corr,) numpy.ndarray
The CC time series for one template.
threshold: (n_corr,) numpy.ndarray or scalar
The detection threshold.
search_win: scalar int
The minimum inter-event time, in units of correlation step.
Returns
--------
cc_idx: (n_detections,) numpy.ndarray
The list of all selected CC indexes. They give the timings of the
detected events.
"""
cc_detections = cc_t > threshold
cc_idx = np.where(cc_detections)[0]
cc_idx = list(cc_idx)
n_rm = 0
for i in range(1, len(cc_idx)):
if (cc_idx[i - n_rm] - cc_idx[i - n_rm - 1]) < search_win:
if cc_t[cc_idx[i - n_rm]] > cc_t[cc_idx[i - n_rm - 1]]:
# keep (i-n_rm)-th detection
cc_idx.remove(cc_idx[i - n_rm - 1])
else:
# keep (i-n_rm-1)-th detection
cc_idx.remove(cc_idx[i - n_rm])
n_rm += 1
cc_idx = np.asarray(cc_idx)
return cc_idx
# INTEREVENT_TIME_RESOLUTION_SEC: In some cases, a template might trigger multiple, closely spaced detections because
# of a phenomenon similar to that of "cycle skipping", where the waveform correlates
# well with a time-shifted version of itself. Thus, to avoid redundant detections, we
# set a minimum time separation between triggers (rule of thumb: about half the template duration)
INTEREVENT_TIME_RESOLUTION_SEC = 5.
INTEREVENT_TIME_RESOLUTION_SAMP = int(INTEREVENT_TIME_RESOLUTION_SEC * SAMPLING_RATE_HZ)
_cc = cc[0, :]
time_cc = np.arange(len(_cc)) * FMF_STEP_SAMP / SAMPLING_RATE_HZ
NUM_RMS = 8.
detection_threshold = NUM_RMS * np.std(_cc)
event_cc_indexes = select_cc_indexes(_cc, detection_threshold, INTEREVENT_TIME_RESOLUTION_SAMP)
fig = plt.figure("network_averaged_cc_w_threshold", figsize=(20, 6))
gs = fig.add_gridspec(ncols=4)
ax1 = fig.add_subplot(gs[:3])
ax1.plot(time_cc, _cc, lw=0.75)
ax1.scatter(time_cc[event_cc_indexes], _cc[event_cc_indexes], linewidths=0.25, edgecolor="k", color="r", zorder=2)
ax1.set_xlabel("Elapsed time (sec)")
ax1.set_ylabel("Network-averaged cc")
ax1.set_xlim(time_cc.min(), time_cc.max())
ax1.set_title("Time series of template/continuous seismograms similarity")
ax2 = fig.add_subplot(gs[3], sharey=ax1)
_ = ax2.hist(_cc, orientation="horizontal", bins=250, density=True, zorder=2)
ax2.set_xlabel("Normalized count")
ax2.set_title("Histogram")
label = f"Detection threshold: {NUM_RMS:.0f}"r"$\times \mathrm{RMS}(\mathrm{CC}(t))$"f"\n{len(event_cc_indexes):d} detected events"
for ax in [ax1, ax2]:
ax.grid()
ax.axhline(
detection_threshold, ls="--", color="r",
label=label
)
ax1.legend(loc="upper left")
<matplotlib.legend.Legend at 0x7fadd03341c0>
Use the trigger times to build the earthquake catalog and extract event waveforms on a given station/component.
STATION_NAME = "CLC"
COMPONENT_NAME = "Z"
date = pd.Timestamp(
(continuous_seismograms[0].stats.starttime.timestamp + continuous_seismograms[0].stats.endtime.timestamp) / 2.,
unit="s"
).strftime("%Y-%m-%d")
sta_idx = station_codes.index(STATION_NAME)
cp_idx = component_codes.index(COMPONENT_NAME)
catalog = {
"detection_time": [],
"peak_amplitude": [],
"cc": [],
"normalized_cc": []
}
detected_event_waveforms = []
for i in range(len(event_cc_indexes)):
idx_start = event_cc_indexes[i] * FMF_STEP_SAMP + moveouts_samp_arr[sta_idx, cp_idx]
idx_end = idx_start + TEMPLATE_DURATION_SAMP
detected_event_waveforms.append(continuous_seismograms_arr[sta_idx, cp_idx, idx_start:idx_end])
# --------------------------------------
detection_time = pd.Timestamp(date) + pd.Timedelta(event_cc_indexes[i] * FMF_STEP_SAMP / SAMPLING_RATE_HZ, "s")
cc = _cc[event_cc_indexes[i]]
normalized_cc = cc / detection_threshold
peak_amplitude = np.abs(detected_event_waveforms[-1]).max()
catalog["detection_time"].append(detection_time)
catalog["peak_amplitude"].append(peak_amplitude)
catalog["cc"].append(cc)
catalog["normalized_cc"].append(normalized_cc)
detected_event_waveforms = np.asarray(detected_event_waveforms)
catalog = pd.DataFrame(catalog)
catalog
detection_time | peak_amplitude | cc | normalized_cc | |
---|---|---|---|---|
0 | 2019-07-04 15:42:49.840 | 0.003044 | 0.167660 | 1.426171 |
1 | 2019-07-04 16:07:21.880 | 0.002178 | 0.162098 | 1.378858 |
2 | 2019-07-04 16:13:44.960 | 0.035267 | 0.252016 | 2.143727 |
3 | 2019-07-04 17:02:56.960 | 16.214684 | 1.000000 | 8.506305 |
4 | 2019-07-04 17:09:21.680 | 0.155795 | 0.339568 | 2.888470 |
5 | 2019-07-04 17:11:41.320 | 0.006684 | 0.120572 | 1.025621 |
6 | 2019-07-04 17:12:16.520 | 0.116193 | 0.181086 | 1.540373 |
7 | 2019-07-04 17:12:39.680 | 0.030746 | 0.150769 | 1.282485 |
8 | 2019-07-04 17:13:28.560 | 0.001898 | 0.156879 | 1.334461 |
9 | 2019-07-04 17:13:53.480 | 0.003748 | 0.125633 | 1.068674 |
10 | 2019-07-04 17:15:48.640 | 0.019283 | 0.254803 | 2.167428 |
11 | 2019-07-04 17:16:20.080 | 0.002100 | 0.200094 | 1.702063 |
12 | 2019-07-04 17:17:43.000 | 0.000707 | 0.123891 | 1.053855 |
13 | 2019-07-04 17:18:09.040 | 0.001339 | 0.118229 | 1.005688 |
14 | 2019-07-04 17:20:56.360 | 0.000434 | 0.154932 | 1.317895 |
15 | 2019-07-04 17:27:36.680 | 0.002425 | 0.143366 | 1.219511 |
16 | 2019-07-04 17:32:54.160 | 0.003028 | 0.174316 | 1.482784 |
17 | 2019-07-04 17:37:28.920 | 9.444638 | 0.129424 | 1.100916 |
18 | 2019-07-04 17:39:37.680 | 3.430590 | 0.118089 | 1.004503 |
19 | 2019-07-04 18:10:10.960 | 0.157595 | 0.127386 | 1.083580 |
20 | 2019-07-04 18:19:09.040 | 0.277436 | 0.121896 | 1.036887 |
21 | 2019-07-04 19:03:40.560 | 0.265189 | 0.148633 | 1.264321 |
22 | 2019-07-04 20:09:45.160 | 0.116027 | 0.146313 | 1.244580 |
23 | 2019-07-04 20:55:00.480 | 0.023483 | 0.179276 | 1.524972 |
24 | 2019-07-04 21:20:53.760 | 0.014506 | 0.133311 | 1.133981 |
25 | 2019-07-04 21:46:58.000 | 0.026888 | 0.132058 | 1.123324 |
26 | 2019-07-04 21:50:55.440 | 0.042805 | 0.216717 | 1.843457 |
27 | 2019-07-04 23:38:45.680 | 0.012385 | 0.130068 | 1.106398 |
Plot some waveforms¶
When building a catalog, it is always necessary to visualize some of the detected event waveforms to get a sense of the ratio of true-to-false detection rate.
In the following, we plot the waveforms of each detected event and we also compare the stack of all the waveforms to the original template waveform. Since all events share similar waveforms, the stack is similar to the template waveform. Moreover, since noise across all these waveforms sums up incoherently, stacking acts as a denoiser which may help you produce a cleaner version of the template waveform, for example on remote stations.
fig = plt.figure("detected_event_waveforms", figsize=(10, 15))
gs = fig.add_gridspec(nrows=4)
ax1 = fig.add_subplot(gs[:3])
_time_wav = np.arange(detected_event_waveforms.shape[1]) / SAMPLING_RATE_HZ
stack = np.zeros(detected_event_waveforms.shape[1])
for i in range(detected_event_waveforms.shape[0]):
norm = np.abs(detected_event_waveforms[i, :]).max()
if catalog["cc"].iloc[i] > 0.999:
color = "r"
template_wav = detected_event_waveforms[i, :] / norm
else:
color = "k"
time_of_day = catalog["detection_time"].iloc[i].strftime("%H:%M:%S")
ax1.plot(_time_wav, detected_event_waveforms[i, :] / norm + i * 1.5, color=color)
ax1.text(0.98 * _time_wav.max(), i * 1.5 + 0.1, time_of_day, ha="right", va="bottom")
stack += detected_event_waveforms[i, :] / norm
stack /= np.abs(stack).max()
ax1.set_xlabel("Time (s)")
ax1.set_xlim(_time_wav.min(), _time_wav.max())
ax1.set_ylabel("Normalized offset amplitude")
ax1.set_title(f"Events detected on {date} and recorded by {STATION_NAME}.{COMPONENT_NAME}")
ax2 = fig.add_subplot(gs[3], sharex=ax1)
ax2.plot(_time_wav, stack, color="blue", label="Stacked waveforms")
ax2.plot(_time_wav, template_wav, color="red", ls="--", label="Template waveform")
ax2.legend(loc="upper left")
ax2.set_xlabel("Time (s)")
ax2.set_ylabel("Normalized amplitude")
Text(0, 0.5, 'Normalized amplitude')
Plot template return time vs detection time¶
catalog["return_time_s"] = catalog["detection_time"].diff().dt.total_seconds()
catalog
detection_time | peak_amplitude | cc | normalized_cc | return_time_s | |
---|---|---|---|---|---|
0 | 2019-07-04 15:42:49.840 | 0.003044 | 0.167660 | 1.426171 | NaN |
1 | 2019-07-04 16:07:21.880 | 0.002178 | 0.162098 | 1.378858 | 1472.04 |
2 | 2019-07-04 16:13:44.960 | 0.035267 | 0.252016 | 2.143727 | 383.08 |
3 | 2019-07-04 17:02:56.960 | 16.214684 | 1.000000 | 8.506305 | 2952.00 |
4 | 2019-07-04 17:09:21.680 | 0.155795 | 0.339568 | 2.888470 | 384.72 |
5 | 2019-07-04 17:11:41.320 | 0.006684 | 0.120572 | 1.025621 | 139.64 |
6 | 2019-07-04 17:12:16.520 | 0.116193 | 0.181086 | 1.540373 | 35.20 |
7 | 2019-07-04 17:12:39.680 | 0.030746 | 0.150769 | 1.282485 | 23.16 |
8 | 2019-07-04 17:13:28.560 | 0.001898 | 0.156879 | 1.334461 | 48.88 |
9 | 2019-07-04 17:13:53.480 | 0.003748 | 0.125633 | 1.068674 | 24.92 |
10 | 2019-07-04 17:15:48.640 | 0.019283 | 0.254803 | 2.167428 | 115.16 |
11 | 2019-07-04 17:16:20.080 | 0.002100 | 0.200094 | 1.702063 | 31.44 |
12 | 2019-07-04 17:17:43.000 | 0.000707 | 0.123891 | 1.053855 | 82.92 |
13 | 2019-07-04 17:18:09.040 | 0.001339 | 0.118229 | 1.005688 | 26.04 |
14 | 2019-07-04 17:20:56.360 | 0.000434 | 0.154932 | 1.317895 | 167.32 |
15 | 2019-07-04 17:27:36.680 | 0.002425 | 0.143366 | 1.219511 | 400.32 |
16 | 2019-07-04 17:32:54.160 | 0.003028 | 0.174316 | 1.482784 | 317.48 |
17 | 2019-07-04 17:37:28.920 | 9.444638 | 0.129424 | 1.100916 | 274.76 |
18 | 2019-07-04 17:39:37.680 | 3.430590 | 0.118089 | 1.004503 | 128.76 |
19 | 2019-07-04 18:10:10.960 | 0.157595 | 0.127386 | 1.083580 | 1833.28 |
20 | 2019-07-04 18:19:09.040 | 0.277436 | 0.121896 | 1.036887 | 538.08 |
21 | 2019-07-04 19:03:40.560 | 0.265189 | 0.148633 | 1.264321 | 2671.52 |
22 | 2019-07-04 20:09:45.160 | 0.116027 | 0.146313 | 1.244580 | 3964.60 |
23 | 2019-07-04 20:55:00.480 | 0.023483 | 0.179276 | 1.524972 | 2715.32 |
24 | 2019-07-04 21:20:53.760 | 0.014506 | 0.133311 | 1.133981 | 1553.28 |
25 | 2019-07-04 21:46:58.000 | 0.026888 | 0.132058 | 1.123324 | 1564.24 |
26 | 2019-07-04 21:50:55.440 | 0.042805 | 0.216717 | 1.843457 | 237.44 |
27 | 2019-07-04 23:38:45.680 | 0.012385 | 0.130068 | 1.106398 | 6470.24 |
fig, ax = plt.subplots(num="return_time_vs_detection_time", figsize=(16, 7))
ax.scatter(catalog["detection_time"], catalog["return_time_s"], c=catalog["cc"], linewidths=0.25, edgecolor="k", cmap="magma", s=50)
ax.set_xlabel("Detection time")
ax.set_ylabel("Return time (s)")
ax.set_title("Return time vs detection time")
ax.grid()
cbar = plt.colorbar(ax.collections[0], ax=ax)
cbar.set_label("CC")
ax.set_yscale("log")
Relative magnitude¶
Template matching lends itself well to quickly estimate relative event magnitudes, relative to the magnitude of the template event.
$$ M_r = M_{\mathrm{ref}} + \dfrac{1}{N} \sum_{ch=1}^{N} \log \dfrac{A_{ch}}{A_{\mathrm{ref},ch}} $$
where the sum is taken over $N$ channels and $A_{ch}$ is the peak amplitude measured on channel $ch$ while $A_{\mathrm{ref},ch}$ is the peak amplitude of the reference event (the template) measured on the same channel.
M_REF = selected_event_meta.magnitude
We can use the peak amplitudes that we read before.
template_index = np.where(catalog["cc"] > 0.999)[0][0]
m_rel = M_REF + np.log10(catalog["peak_amplitude"] / catalog["peak_amplitude"].iloc[template_index])
catalog["m_rel_one_channel"] = m_rel
catalog
detection_time | peak_amplitude | cc | normalized_cc | return_time_s | m_rel_one_channel | |
---|---|---|---|---|---|---|
0 | 2019-07-04 15:42:49.840 | 0.003044 | 0.167660 | 1.426171 | NaN | 0.750315 |
1 | 2019-07-04 16:07:21.880 | 0.002178 | 0.162098 | 1.378858 | 1472.04 | 0.604785 |
2 | 2019-07-04 16:13:44.960 | 0.035267 | 0.252016 | 2.143727 | 383.08 | 1.814178 |
3 | 2019-07-04 17:02:56.960 | 16.214684 | 1.000000 | 8.506305 | 2952.00 | 4.476724 |
4 | 2019-07-04 17:09:21.680 | 0.155795 | 0.339568 | 2.888470 | 384.72 | 2.459368 |
5 | 2019-07-04 17:11:41.320 | 0.006684 | 0.120572 | 1.025621 | 139.64 | 1.091826 |
6 | 2019-07-04 17:12:16.520 | 0.116193 | 0.181086 | 1.540373 | 35.20 | 2.331995 |
7 | 2019-07-04 17:12:39.680 | 0.030746 | 0.150769 | 1.282485 | 23.16 | 1.754599 |
8 | 2019-07-04 17:13:28.560 | 0.001898 | 0.156879 | 1.334461 | 48.88 | 0.545083 |
9 | 2019-07-04 17:13:53.480 | 0.003748 | 0.125633 | 1.068674 | 24.92 | 0.840592 |
10 | 2019-07-04 17:15:48.640 | 0.019283 | 0.254803 | 2.167428 | 115.16 | 1.551985 |
11 | 2019-07-04 17:16:20.080 | 0.002100 | 0.200094 | 1.702063 | 31.44 | 0.588959 |
12 | 2019-07-04 17:17:43.000 | 0.000707 | 0.123891 | 1.053855 | 82.92 | 0.116397 |
13 | 2019-07-04 17:18:09.040 | 0.001339 | 0.118229 | 1.005688 | 26.04 | 0.393615 |
14 | 2019-07-04 17:20:56.360 | 0.000434 | 0.154932 | 1.317895 | 167.32 | -0.096164 |
15 | 2019-07-04 17:27:36.680 | 0.002425 | 0.143366 | 1.219511 | 400.32 | 0.651551 |
16 | 2019-07-04 17:32:54.160 | 0.003028 | 0.174316 | 1.482784 | 317.48 | 0.747959 |
17 | 2019-07-04 17:37:28.920 | 9.444638 | 0.129424 | 1.100916 | 274.76 | 4.242001 |
18 | 2019-07-04 17:39:37.680 | 3.430590 | 0.118089 | 1.004503 | 128.76 | 3.802184 |
19 | 2019-07-04 18:10:10.960 | 0.157595 | 0.127386 | 1.083580 | 1833.28 | 2.464358 |
20 | 2019-07-04 18:19:09.040 | 0.277436 | 0.121896 | 1.036887 | 538.08 | 2.709977 |
21 | 2019-07-04 19:03:40.560 | 0.265189 | 0.148633 | 1.264321 | 2671.52 | 2.690371 |
22 | 2019-07-04 20:09:45.160 | 0.116027 | 0.146313 | 1.244580 | 3964.60 | 2.331373 |
23 | 2019-07-04 20:55:00.480 | 0.023483 | 0.179276 | 1.524972 | 2715.32 | 1.637578 |
24 | 2019-07-04 21:20:53.760 | 0.014506 | 0.133311 | 1.133981 | 1553.28 | 1.428355 |
25 | 2019-07-04 21:46:58.000 | 0.026888 | 0.132058 | 1.123324 | 1564.24 | 1.696379 |
26 | 2019-07-04 21:50:55.440 | 0.042805 | 0.216717 | 1.843457 | 237.44 | 1.898305 |
27 | 2019-07-04 23:38:45.680 | 0.012385 | 0.130068 | 1.106398 | 6470.24 | 1.359715 |
However, it is much better to average the log ratios over multiple channels.
# first, extract clips from the continuous seismograms
detected_event_waveforms_all_channels = np.zeros(
(len(catalog), num_stations, num_channels, TEMPLATE_DURATION_SAMP), dtype=np.float32
)
for i in range(len(catalog)):
for s in range(num_stations):
for c in range(num_channels):
idx_start = event_cc_indexes[i] * FMF_STEP_SAMP + moveouts_samp_arr[sta_idx, cp_idx]
idx_end = idx_start + TEMPLATE_DURATION_SAMP
detected_event_waveforms_all_channels[i, s, c, :] = (
continuous_seismograms_arr[s, c, idx_start:idx_end]
)
# then, measure the peak amplitudes
amplitude_ratios = (
np.max(detected_event_waveforms_all_channels, axis=-1)
/ np.max(detected_event_waveforms_all_channels[template_index, ...], axis=-1)[None, ...]
)
# if some data were missing, we may have divided by 0 and nans or infs
# mask them and ignore them in the following calculation
invalid = (np.isnan(amplitude_ratios) | np.isinf(amplitude_ratios))
amplitude_ratios = np.ma.masked_where(invalid, amplitude_ratios)
# apply the above formula to get the relative magnitudes
m_rel_all_channels = M_REF + np.ma.mean(np.log10(amplitude_ratios), axis=(1, 2))
catalog["m_rel_all_channels"] = m_rel_all_channels
catalog
/tmp/ipykernel_471469/457280950.py:3: RuntimeWarning: invalid value encountered in true_divide np.max(detected_event_waveforms_all_channels, axis=-1)
detection_time | peak_amplitude | cc | normalized_cc | return_time_s | m_rel_one_channel | m_rel_all_channels | |
---|---|---|---|---|---|---|---|
0 | 2019-07-04 15:42:49.840 | 0.003044 | 0.167660 | 1.426171 | NaN | 0.750315 | 0.948512 |
1 | 2019-07-04 16:07:21.880 | 0.002178 | 0.162098 | 1.378858 | 1472.04 | 0.604785 | 0.914793 |
2 | 2019-07-04 16:13:44.960 | 0.035267 | 0.252016 | 2.143727 | 383.08 | 1.814178 | 1.781098 |
3 | 2019-07-04 17:02:56.960 | 16.214684 | 1.000000 | 8.506305 | 2952.00 | 4.476724 | 4.476724 |
4 | 2019-07-04 17:09:21.680 | 0.155795 | 0.339568 | 2.888470 | 384.72 | 2.459368 | 2.556656 |
5 | 2019-07-04 17:11:41.320 | 0.006684 | 0.120572 | 1.025621 | 139.64 | 1.091826 | 1.440504 |
6 | 2019-07-04 17:12:16.520 | 0.116193 | 0.181086 | 1.540373 | 35.20 | 2.331995 | 2.389888 |
7 | 2019-07-04 17:12:39.680 | 0.030746 | 0.150769 | 1.282485 | 23.16 | 1.754599 | 2.043036 |
8 | 2019-07-04 17:13:28.560 | 0.001898 | 0.156879 | 1.334461 | 48.88 | 0.545083 | 1.010687 |
9 | 2019-07-04 17:13:53.480 | 0.003748 | 0.125633 | 1.068674 | 24.92 | 0.840592 | 1.131839 |
10 | 2019-07-04 17:15:48.640 | 0.019283 | 0.254803 | 2.167428 | 115.16 | 1.551985 | 1.764089 |
11 | 2019-07-04 17:16:20.080 | 0.002100 | 0.200094 | 1.702063 | 31.44 | 0.588959 | 1.078424 |
12 | 2019-07-04 17:17:43.000 | 0.000707 | 0.123891 | 1.053855 | 82.92 | 0.116397 | 0.674097 |
13 | 2019-07-04 17:18:09.040 | 0.001339 | 0.118229 | 1.005688 | 26.04 | 0.393615 | 0.755659 |
14 | 2019-07-04 17:20:56.360 | 0.000434 | 0.154932 | 1.317895 | 167.32 | -0.096164 | 0.674497 |
15 | 2019-07-04 17:27:36.680 | 0.002425 | 0.143366 | 1.219511 | 400.32 | 0.651551 | 1.081399 |
16 | 2019-07-04 17:32:54.160 | 0.003028 | 0.174316 | 1.482784 | 317.48 | 0.747959 | 1.004731 |
17 | 2019-07-04 17:37:28.920 | 9.444638 | 0.129424 | 1.100916 | 274.76 | 4.242001 | 4.492626 |
18 | 2019-07-04 17:39:37.680 | 3.430590 | 0.118089 | 1.004503 | 128.76 | 3.802184 | 4.082750 |
19 | 2019-07-04 18:10:10.960 | 0.157595 | 0.127386 | 1.083580 | 1833.28 | 2.464358 | 2.506171 |
20 | 2019-07-04 18:19:09.040 | 0.277436 | 0.121896 | 1.036887 | 538.08 | 2.709977 | 3.136539 |
21 | 2019-07-04 19:03:40.560 | 0.265189 | 0.148633 | 1.264321 | 2671.52 | 2.690371 | 2.455167 |
22 | 2019-07-04 20:09:45.160 | 0.116027 | 0.146313 | 1.244580 | 3964.60 | 2.331373 | 2.180167 |
23 | 2019-07-04 20:55:00.480 | 0.023483 | 0.179276 | 1.524972 | 2715.32 | 1.637578 | 1.975095 |
24 | 2019-07-04 21:20:53.760 | 0.014506 | 0.133311 | 1.133981 | 1553.28 | 1.428355 | 1.859795 |
25 | 2019-07-04 21:46:58.000 | 0.026888 | 0.132058 | 1.123324 | 1564.24 | 1.696379 | 2.244630 |
26 | 2019-07-04 21:50:55.440 | 0.042805 | 0.216717 | 1.843457 | 237.44 | 1.898305 | 2.152389 |
27 | 2019-07-04 23:38:45.680 | 0.012385 | 0.130068 | 1.106398 | 6470.24 | 1.359715 | 1.358247 |
Next, we plot the distribution of earthquake magnitudes. The cumulative distribution usually follows the so-called Gutenberg-Richter law: $$ \log N(m \geq M) = a - b M$$ In this example, the number of events is too low to estimate a meaningful b-value.
fig, ax = plt.subplots(num="magnitude_distribution", figsize=(8, 8))
axb = ax.twinx()
ax.set_title("Distribution of earthquake magnitudes")
_ = ax.hist(catalog["m_rel_all_channels"], bins=15, color="k", alpha=0.5)
axb.plot(np.sort(catalog["m_rel_all_channels"]), np.arange(len(catalog))[::-1], color="k", lw=2)
ax.set_xlabel("Magnitude")
ax.set_ylabel("Event Count")
axb.set_ylabel("Cumulative Event Count")
for ax in [ax, axb]:
ax.set_yscale("log")