.. role:: skyblue
.. role:: red

===
m66
===

A time series data point is anomalous if the 6th median is 6 standard
deviations (six-sigma) from the time series 6th median standard deviation and
this persists for x_windows, where `x_windows = int(window / 2)`.

This algorithm finds **SIGNIFICANT** changepoints in a time series, similar to
PELT and Bayesian Online Changepoint Detection. However, it is more robust to
instantaneous outliers and more conditionally selective of changepoints.

See the docstrings - https://earthgecko-skyline.readthedocs.io/en/latest/skyline.custom_algorithms.html#module-custom_algorithms.m66

See the custom_algorithm source - https://github.com/earthgecko/skyline/blob/master/skyline/custom_algorithms/m66.py

m66 TCPDBench results
=====================

https://github.com/alan-turing-institute/TCPDBench

Seeing as there is a changepoint detection benchmarking app, m66 might as well
be tested with it. It would be expected to score low, seeing as by design it
does not detect **all** changepoints, only significant changepoints. But it
scores better than expected, it places 6th overall.

**Heatmap**

.. figure:: ../images/m66/m66.with.TCPDBench.results.heatmap.greens.png
   :alt: m66 TCPDBench results heatmap

**Best highlighted**

.. figure:: ../images/m66/m66.with.TCPDBench.results.highlight.blue.png
   :alt: m66 TCPDBench results best highlighted heatmap

**Top and bottom 3**

.. figure:: ../images/m66/m66.with.TCPDBench.results.top.bottom.3.png
   :alt: m66 TCPDBench results top and bottom 3 heatmap

**Results and rank**

.. image:: ../images/m66/m66.with.TCPDBench.results.heatmap.rank.rank.png
   :alt: m66 TCPDBench results rank heatmap

Testing m66 with TCPDBench
==========================

"If you want to climb the mountain, you must do all the hard things and climb
it." -- Kukuczka

Apart from some missing docs and dependency bugs in TCPDBench, got there in
the end.

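Before wiring m66 into the benchmark, it can help to have the rule described
at the top of this page in a small runnable form. The following is a toy
sketch of one reading of that description only; it is not the Skyline m66
implementation (see the source link above for that). It assumes that "6th
median" means a rolling median applied 6 times and that ``window=5``; the
function name ``toy_m66`` and all of its parameter defaults are hypothetical.

.. code-block:: python

    import numpy as np
    import pandas as pd


    def toy_m66(values, nth_median=6, n_sigma=6, window=5):
        """Toy illustration only, NOT the Skyline m66 implementation."""
        # Assumption: apply a rolling median nth_median times to smooth away
        # instantaneous outliers while preserving sustained level shifts
        smoothed = pd.Series(values, dtype=float)
        for _ in range(nth_median):
            smoothed = smoothed.rolling(window, min_periods=1).median()
        # Rolling standard deviation of the smoothed series
        rolling_std = smoothed.rolling(window).std().fillna(0.0)
        sigma = rolling_std.std()
        if not np.isfinite(sigma) or sigma < 1e-12:
            # The smoothed series is flat, there is nothing to report
            return []
        # How many sigmas each rolling std is from the mean rolling std
        zscores = (rolling_std - rolling_std.mean()) / sigma
        breach = (zscores.abs() > n_sigma).to_numpy()
        # A breach only counts if it persists for x_windows consecutive points
        x_windows = int(window / 2)
        changepoints = []
        run_start = None
        for i, flagged in enumerate(breach):
            if flagged and run_start is None:
                run_start = i
            elif not flagged and run_start is not None:
                if i - run_start >= x_windows:
                    changepoints.append(run_start)
                run_start = None
        if run_start is not None and len(breach) - run_start >= x_windows:
            changepoints.append(run_start)
        return changepoints


    # A sustained level shift at index 200
    step = [0.0] * 200 + [10.0] * 200
    # A flat series with a single instantaneous outlier at index 200
    spike = [0.0] * 400
    spike[200] = 100.0

    print(toy_m66(step))
    print(toy_m66(spike))

On the step series this sketch reports a single changepoint shortly after the
level shift (the repeated rolling median delays it by a few points), while the
lone spike is removed by the median smoothing and nothing is reported. That is
the kind of selectivity the description above is aiming at.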
Build datasets on CentOS 8
--------------------------

Set up TCPD and TCPDBench on a CentOS 8 server

https://github.com/alan-turing-institute/TCPD

https://github.com/alan-turing-institute/TCPDBench

.. code-block:: bash

    yum install texlive
    yum install latexmk

    PYTHON_VERSION="3.8.11"
    PYTHON_MAJOR_VERSION="3.8"
    PYTHON_VIRTUALENV_DIR="/opt/python_virtualenv"
    PROJECT="TCPDBench-py3811"
    cd "${PYTHON_VIRTUALENV_DIR}/projects"
    virtualenv --python="${PYTHON_VIRTUALENV_DIR}/versions/${PYTHON_VERSION}/bin/python${PYTHON_MAJOR_VERSION}" "$PROJECT"

    cd /opt/python_virtualenv/projects/$PROJECT/
    source bin/activate

    # dataset
    # As per https://github.com/alan-turing-institute/TCPD#using-the-command-line
    git clone https://github.com/alan-turing-institute/TCPD
    cd TCPD
    /opt/python_virtualenv/projects/$PROJECT/bin/"pip${PYTHON_MAJOR_VERSION}" install -r requirements.txt

    Installing collected packages: six, urllib3, pytz, python-dateutil, numpy, idna, charset-normalizer, certifi, soupsieve, requests, regex, pyrsistent, pandas, multitasking, lxml, et-xmlfile, chardet, attrs, yfinance, Pillow, openpyxl, jsonschema, diff-match-patch, clevercsv, beautifulsoup4
    Successfully installed Pillow-8.3.1 attrs-21.2.0 beautifulsoup4-4.9.3 certifi-2021.5.30 chardet-4.0.0 charset-normalizer-2.0.4 clevercsv-0.7.0 diff-match-patch-20200713 et-xmlfile-1.1.0 idna-3.2 jsonschema-3.2.0 lxml-4.6.3 multitasking-0.0.9 numpy-1.21.1 openpyxl-3.0.7 pandas-1.3.1 pyrsistent-0.18.0 python-dateutil-2.8.2 pytz-2021.1 regex-2021.8.3 requests-2.26.0 six-1.16.0 soupsieve-2.2.1 urllib3-1.26.6 yfinance-0.1.63
    (TCPDBench-py3811) [root@server TCPD] /opt/python_virtualenv/projects/$PROJECT/bin/"pip${PYTHON_MAJOR_VERSION}" list
    Package            Version
    ------------------ ---------
    attrs              21.2.0
    beautifulsoup4     4.9.3
    certifi            2021.5.30
    chardet            4.0.0
    charset-normalizer 2.0.4
    clevercsv          0.7.0
    diff-match-patch   20200713
    et-xmlfile         1.1.0
    idna               3.2
    jsonschema         3.2.0
    lxml               4.6.3
    multitasking       0.0.9
    numpy              1.21.1
    openpyxl           3.0.7
    pandas             1.3.1
    Pillow             8.3.1
    pip                21.2.4
    pyrsistent         0.18.0
    python-dateutil    2.8.2
    pytz               2021.1
    regex              2021.8.3
    requests           2.26.0
    setuptools         57.4.0
    six                1.16.0
    soupsieve          2.2.1
    urllib3            1.26.6
    wheel              0.37.0
    yfinance           0.1.63
    (TCPDBench-py3811) [root@server TCPD]

Build datasets

.. code-block:: bash

    /opt/python_virtualenv/projects/$PROJECT/bin/"python${PYTHON_MAJOR_VERSION}" build_tcpd.py -v collect

    (TCPDBench-py3811) [root@server TCPD] /opt/python_virtualenv/projects/$PROJECT/bin/"python${PYTHON_MAJOR_VERSION}" build_tcpd.py -v collect
    Running collect action for dataset: apple ... ok
    Running collect action for dataset: bee_waggle_6 ... ok
    Running collect action for dataset: bitcoin ... ok
    Running collect action for dataset: iceland_tourism ... ok
    Running collect action for dataset: measles ... ok
    Running collect action for dataset: occupancy ... ok
    Running collect action for dataset: ratner_stock ... ok
    Running collect action for dataset: robocalls ... ok
    Running collect action for dataset: scanline_126007 ... ok
    Running collect action for dataset: scanline_42049 ... ok
    (TCPDBench-py3811) [root@server TCPD]

Check datasets

.. code-block:: bash

    /opt/python_virtualenv/projects/$PROJECT/bin/"python${PYTHON_MAJOR_VERSION}" ./utils/check_checksums.py -v -c ./checksums.json -d ./datasets

    (TCPDBench-py3811) [root@server TCPD] /opt/python_virtualenv/projects/$PROJECT/bin/"python${PYTHON_MAJOR_VERSION}" ./utils/check_checksums.py -v -c ./checksums.json -d ./datasets
    Checking apple.json
    Checking bank.json
    Checking bee_waggle_6.json
    Checking bitcoin.json
    Checking brent_spot.json
    Checking businv.json
    Checking centralia.json
    Checking children_per_woman.json
    Checking co2_canada.json
    Checking construction.json
    Checking debt_ireland.json
    Checking gdp_argentina.json
    Checking gdp_croatia.json
    Checking gdp_iran.json
    Checking gdp_japan.json
    Checking global_co2.json
    Checking homeruns.json
    Checking iceland_tourism.json
    Checking jfk_passengers.json
    Checking lga_passengers.json
    Checking measles.json
    Checking nile.json
    Checking occupancy.json
    Checking ozone.json
    Checking quality_control_1.json
    Checking quality_control_2.json
    Checking quality_control_3.json
    Checking quality_control_4.json
    Checking quality_control_5.json
    Checking rail_lines.json
    Checking ratner_stock.json
    Checking robocalls.json
    Checking run_log.json
    Checking scanline_126007.json
    Checking scanline_42049.json
    Checking seatbelts.json
    Checking shanghai_license.json
    Checking uk_coal_employ.json
    Checking unemployment_nl.json
    Checking us_population.json
    Checking usd_isk.json
    Checking well_log.json
    All ok.
    (TCPDBench-py3811) [root@server TCPD]

Set up TCPDBench

.. code-block:: bash

    cd /opt/python_virtualenv/projects/$PROJECT/
    source bin/activate

    # As per https://github.com/alan-turing-institute/TCPDBench#getting-started
    git clone --recurse-submodules https://github.com/alan-turing-institute/TCPDBench
    cd TCPDBench
    /opt/python_virtualenv/projects/$PROJECT/bin/"pip${PYTHON_MAJOR_VERSION}" install -r ./analysis/requirements.txt

    Installing collected packages: sortedcontainers, intervaltree, termcolor, tabulate, scipy, labella, colorama
    Successfully installed colorama-0.4.4 intervaltree-3.1.0 labella-0.9.8 scipy-1.7.1 sortedcontainers-2.4.0 tabulate-0.8.9 termcolor-1.1.0
    (TCPDBench-py3811) [root@server TCPDBench]

Install m66 requirement

.. code-block:: bash

    cd /opt/python_virtualenv/projects/$PROJECT/
    bin/"pip${PYTHON_MAJOR_VERSION}" install bottleneck

    (TCPDBench-py3811) [root@server TCPDBench-py3811] bin/"pip${PYTHON_MAJOR_VERSION}" install bottleneck
    Collecting bottleneck
      Using cached Bottleneck-1.3.2-cp38-cp38-linux_x86_64.whl
    Requirement already satisfied: numpy in ./lib/python3.8/site-packages (from bottleneck) (1.21.1)
    Installing collected packages: bottleneck
    Successfully installed bottleneck-1.3.2
    (TCPDBench-py3811) [root@server TCPDBench-py3811]

Errors on CentOS 8

LaTeX / texlive error. And seemingly no texlive-standalone package on CentOS 8 :(

.. code-block:: bash

    cd /opt/python_virtualenv/projects/$PROJECT/TCPDBench
    make results
    ...
    ...
    python ./analysis/scripts/rank_plots.py -i analysis/output/tables/best_cover_uni_full.json -o analysis/output/rankplots/rankplot_best_cover_uni.tex -b max --type best
    Warning: Filtering out RBOCPDMS due to insufficient results.
    Latexmk: This is Latexmk, John Collins, 18 June 2019, version: 4.65.
    Rule 'pdflatex': The following rules & subrules became out-of-date:
          'pdflatex'
    ------------
    Run number 1 of rule 'pdflatex'
    ------------
    ------------
    Running 'pdflatex --interaction=nonstopmode -recorder -output-directory="/tmp/tmp16cu66k5" "/tmp/tmp16cu66k5/labella_text.tex"'
    ------------
    Latexmk: applying rule 'pdflatex'...
    This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018) (preloaded format=pdflatex)
     restricted \write18 enabled.
    entering extended mode
    (/tmp/tmp16cu66k5/labella_text.tex
    LaTeX2e <2017-04-15>
    Babel <3.17> and hyphenation patterns for 3 language(s) loaded.
    ! LaTeX Error: File `standalone.cls' not found.
    Type X to quit or <RETURN> to proceed,
    or enter new name. (Default extension: cls)
    Enter file name:
    ! Emergency stop.
    l.3 \begin {document}^^M
    ! ==> Fatal error occurred, no output PDF file produced!
    Transcript written on /tmp/tmp16cu66k5/labella_text.log.
    Latexmk: Missing input file: 'standalone.cls' from line
      '! LaTeX Error: File `standalone.cls' not found.'
    Collected error summary (may duplicate other messages):
      pdflatex: Command for 'pdflatex' gave return code 1
          Refer to '/tmp/tmp16cu66k5/labella_text.log' for details
    Latexmk: Use the -f option to force complete processing,
     unless error was exceeding maximum runs, or warnings treated as errors.
    === TeX engine is 'pdfTeX'
    Latexmk: Errors, so I did not complete making targets
    Traceback (most recent call last):
      File "./analysis/scripts/rank_plots.py", line 151, in <module>
        main()
      File "./analysis/scripts/rank_plots.py", line 145, in main
        make_rank_plot(
      File "./analysis/scripts/rank_plots.py", line 90, in make_rank_plot
        tl = TimelineTex(plot_data, options=options)
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/timeline.py", line 509, in __init__
        super().__init__(items, options=options, output_mode="tex")
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/timeline.py", line 145, in __init__
        self.items = self.parse_items(dicts, output_mode=output_mode)
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/timeline.py", line 179, in parse_items
        it = Item(
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/timeline.py", line 97, in __init__
        self.width, self.height = self.get_text_dimensions()
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/timeline.py", line 107, in get_text_dimensions
        width, height = text_dimensions(
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/tex.py", line 139, in text_dimensions
        width, height = get_latex_dims(tex, silent=silent,
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/tex.py", line 106, in get_latex_dims
        compile_latex(fname, tmpdirname, latexmk_options, silent=silent)
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/tex.py", line 94, in compile_latex
        raise (e)
      File "/opt/python_virtualenv/projects/TCPDBench-py3811/lib/python3.8/site-packages/labella/tex.py", line 89, in compile_latex
        output = subprocess.check_output(command, stderr=subprocess.STDOUT)
      File "/opt/python_virtualenv/versions/3.8.11/lib/python3.8/subprocess.py", line 415, in check_output
        return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
      File "/opt/python_virtualenv/versions/3.8.11/lib/python3.8/subprocess.py", line 516, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['latexmk', '--pdf', '--outdir=/tmp/tmp16cu66k5', '--interaction=nonstopmode', '/tmp/tmp16cu66k5/labella_text.tex']' returned non-zero exit status 12.
    make: *** [Makefile:226: analysis/output/rankplots/rankplot_best_cover_uni.tex] Error 1
    (TCPDBench-py3811) [root@server TCPDBench]

This needs LaTeX, so it has to be done locally. Trying a POC on an Ubuntu 14.04
laptop, as CentOS 8 has no texlive-standalone package.

.. code-block:: bash

    PYTHON_VERSION="3.8.6"
    PYTHON_MAJOR_VERSION="3.8"
    PYTHON_VIRTUALENV_DIR="/opt/python_virtualenv"
    PROJECT="TCPDBench-py386"
    cd "${PYTHON_VIRTUALENV_DIR}/projects"
    virtualenv --python="${PYTHON_VIRTUALENV_DIR}/versions/${PYTHON_VERSION}/bin/python${PYTHON_MAJOR_VERSION}" "$PROJECT"

    cd /opt/python_virtualenv/projects/$PROJECT/
    source bin/activate

    # dataset
    # As per https://github.com/alan-turing-institute/TCPD#using-the-command-line
    git clone https://github.com/alan-turing-institute/TCPD
    cd TCPD
    /opt/python_virtualenv/projects/$PROJECT/bin/"pip${PYTHON_MAJOR_VERSION}" install -r requirements.txt

    # Self signed ssl cert errors
    /opt/python_virtualenv/projects/$PROJECT/bin/"python${PYTHON_MAJOR_VERSION}" build_tcpd.py -v collect
    # (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386/TCPD$ /opt/python_virtualenv/projects/$PROJECT/bin/"python${PYTHON_MAJOR_VERSION}" build_tcpd.py -v collect
    # Running collect action for dataset: apple ... ok
    # Running collect action for dataset: bee_waggle_6 ... Error occurred (URLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1124)'))) when trying to download zip. Retrying in 5 seconds <_io.TextIOWrapper name='' mode='w' encoding='utf-8'>
    # Error occurred (URLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1124) '))) when trying to download zip. Retrying in 5 seconds <_io.TextIOWrapper name='' mode='w' encoding='utf-8'>
    # Error occurred (URLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1124) '))) when trying to download zip. Retrying in 5 seconds <_io.TextIOWrapper name='' mode='w' encoding='utf-8'>

This is due to the openssl version on Ubuntu 14.04.

Use the datasets that were built on CentOS 8

.. code-block:: bash

    # rsync datasets from server
    rsync -avz --exclude __pycache__/ -e 'ssh -o "StrictHostKeyChecking=no" -i -l root -ax -o ClearAllForwardings=yes' server:/opt/python_virtualenv/projects/TCPDBench-py3811/TCPD/datasets/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPD/datasets/

    # verify OK
    (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386/TCPD$ /opt/python_virtualenv/projects/$PROJECT/bin/"python${PYTHON_MAJOR_VERSION}" ./utils/check_checksums.py -v -c ./checksums.json -d ./datasets
    Checking apple.json
    Checking bank.json
    Checking bee_waggle_6.json
    Checking bitcoin.json
    Checking brent_spot.json
    Checking businv.json
    Checking centralia.json
    Checking children_per_woman.json
    Checking co2_canada.json
    Checking construction.json
    Checking debt_ireland.json
    Checking gdp_argentina.json
    Checking gdp_croatia.json
    Checking gdp_iran.json
    Checking gdp_japan.json
    Checking global_co2.json
    Checking homeruns.json
    Checking iceland_tourism.json
    Checking jfk_passengers.json
    Checking lga_passengers.json
    Checking measles.json
    Checking nile.json
    Checking occupancy.json
    Checking ozone.json
    Checking quality_control_1.json
    Checking quality_control_2.json
    Checking quality_control_3.json
    Checking quality_control_4.json
    Checking quality_control_5.json
    Checking rail_lines.json
    Checking ratner_stock.json
    Checking robocalls.json
    Checking run_log.json
    Checking scanline_126007.json
    Checking scanline_42049.json
    Checking seatbelts.json
    Checking shanghai_license.json
    Checking uk_coal_employ.json
    Checking unemployment_nl.json
    Checking us_population.json
    Checking usd_isk.json
    Checking well_log.json
    All ok.
    (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386/TCPD$

Run on Ubuntu machine

.. code-block:: bash

    cd /opt/python_virtualenv/projects/TCPDBench-py386/

    # As per https://github.com/alan-turing-institute/TCPDBench#getting-started
    git clone --recurse-submodules https://github.com/alan-turing-institute/TCPDBench
    cd TCPDBench
    /opt/python_virtualenv/projects/$PROJECT/bin/"pip${PYTHON_MAJOR_VERSION}" install -r ./analysis/requirements.txt

    # Install m66 requirement
    cd /opt/python_virtualenv/projects/$PROJECT/
    bin/"pip${PYTHON_MAJOR_VERSION}" install bottleneck

    cd /opt/python_virtualenv/projects/$PROJECT/TCPDBench
    make results
    # OK apart from one

Copy datasets on Ubuntu machine

.. code-block:: bash

    # First, obtain the Turing Change Point Dataset and follow the instructions
    # provided there. Copy the dataset files to a datasets directory in this
    # repository
    mkdir /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/datasets
    rsync -avz /opt/python_virtualenv/projects/TCPDBench-py386/TCPD/datasets/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/datasets/

Move existing abed_results

.. code-block:: bash

    cd /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench
    mv abed_results old_abed_results

    (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench$ mv abed_results old_abed_results

Install abed with error

.. code-block:: bash

    cd /opt/python_virtualenv/projects/TCPDBench-py386
    bin/pip3.8 install abed

    # ERROR
    ...
    gcc -pthread _configtest.o -lvt.mpi -o _configtest
    /usr/bin/ld: cannot find -lvt.mpi
    collect2: error: ld returned 1 exit status
    failure.
    ...
    _configtest.c:2:17: fatal error: mpi.h: No such file or directory
     #include <mpi.h>
                     ^
    compilation terminated.
    failure.
    removing: _configtest.c _configtest.o
    error: Cannot compile MPI programs. Check your configuration!!!
    ----------------------------------------
    ERROR: Failed building wheel for mpi4py
    Failed to build mpi4py
    ERROR: Could not build wheels for mpi4py which use PEP 517 and cannot be installed directly
    (TCPDBench-py386)

Much googling, considered the Docker route, which is no easier unless you fork
and clone your own repo with a Dockerfile ...

Reading the mpi4py docs

.. code-block:: bash

    sudo apt-get install openmpi-bin openmpi-doc libopenmpi-dev

Success

.. code-block:: bash

    cd /opt/python_virtualenv/projects/TCPDBench-py386
    bin/pip3.8 install abed
    ...
    Successfully installed Fabric3-1.14.post1 abed-0.1.2 backports.lzma-0.0.14 bcrypt-3.2.0 bz2file-0.98 cryptography-3.4.7 dominate-2.6.0 gitdb-4.0.7 gitpython-3.1.18 mpi4py-3.1.0 paramiko-2.7.2 progressbar-2.5 pynacl-1.4.0 smmap-4.0.0 tqdm-4.62.0
    (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386$

jupyter. Let us look at the data, to fit m66 to work with it.

.. code-block:: bash

    cd /opt/python_virtualenv/projects/TCPDBench-py386
    source bin/activate
    bin/pip3.8 install jupyter
    jupyter notebook &

https://github.com/earthgecko/skyline/blob/v5.0.0-alpha/tests/20210814.POC.task4236.test.m66.with.TCPDBench.ipynb.py

Got it to run eventually and analysed just m66 and zero and got results, but
cannot plot them with make results. LaTeX breaks.

Perhaps to update you need to run all

.. code-block:: bash

    (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench$ abed status
    There are 41664 tasks left to be done, out of 41664 tasks defined.
    (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench$

And ...

> You may want to run these experiments in parallel on a large number of cores,
> as the expected runtime is on the order of 21 days on a single core. Once this
> command starts running the experiments you will see result files appear in the
> staging directory.

https://github.com/alan-turing-institute/AnnotateChange

See if we can plot what the others scored at least.

.. code-block:: bash

    cd /opt/python_virtualenv/projects/TCPDBench-py386
    source bin/activate
    git clone https://github.com/alan-turing-institute/AnnotateChange
    cd AnnotateChange
    /opt/python_virtualenv/projects/TCPDBench-py386/bin/pip3.8 install -r requirements.txt
    cp .env.example .env.development
    sed -i 's/DB_TYPE=mysql/DB_TYPE=sqlite3/g' .env.development
    ./flask.sh db upgrade
    ./flask.sh admin add --auto-confirm-email

A bit flaky, it looks like it emails the annotations to the user.

From the annotations it looks like users
['6', '7', '8', '9', '10', '12', '13', '14'] did the annotations, and not all
of them annotated all the time series. It looks like they used AnnotateChange
to do that as well.

m66 is a LEGIT changepoint detection algorithm!

.. code-block:: bash

    rm -rf analysis/output/
    # I think this is the first time it has run properly with best
    make -k results
    rm -rf analysis/output
    rsync -az --exclude best_m66/ --exclude default_m66/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/abed_results.original/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/abed_results/
    make -k results
    rm -rf analysis/output/
    make -k results
    rm -rf analysis/output/
    make -k results
    rm -rf stagedir/0
    abed reload_tasks
    mpiexec -np 2 abed local
    rsync -az --exclude best_m66/ --exclude default_m66/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/abed_results.original/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/abed_results/
    make -k results
    rm -rf abed_results/
    rm -rf analysis/output/
    abed reload_tasks
    mpiexec -np 2 abed local
    rsync -az --exclude best_m66/ --exclude default_m66/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/abed_results.original/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/abed_results/
    make -k results
    rm -rf abed_results/
    rm -rf analysis/output/
    abed reload_tasks
    mpiexec -np 2 abed local
    rsync -az --exclude best_m66/ --exclude default_m66/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/abed_results.original/ /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/abed_results/
    make -k results
    ../bin/pip3.8 install pdflatex
    ../bin/pip3.8 install xhtml2pdf
    history | tail -n 50
    (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench$ date
    Sun Aug 15 21:28:11 BST 2021
    (TCPDBench-py386) gary@mc11:/opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench$

results
-------

.. code-block:: python

    """
    best_cover_avg
    m66         0.795455
    bocpd       0.789368
    segneigh    0.783592
    binseg      0.780245
    amoc        0.745989
    bocpdms     0.743791
    pelt        0.725448
    ecp         0.720113
    kcpa        0.625981
    zero        0.578767
    prophet     0.576222
    cpnp        0.552106
    wbs         0.428036
    rfpop       0.414154

    best_f1_avg
    bocpd       0.879675
    binseg      0.855901
    segneigh    0.854902
    m66         0.841738
    amoc        0.798912
    ecp         0.796599
    pelt        0.787211
    kcpa        0.683167
    cpnp        0.665980
    zero        0.662375
    bocpdms     0.620328
    prophet     0.534355
    wbs         0.532729
    rfpop       0.530942

    default_cover_avg
    binseg      0.705799
    amoc        0.701605
    pelt        0.688798
    segneigh    0.676410
    bocpd       0.636019
    bocpdms     0.633351
    rbocpdms    0.628623
    zero        0.582727
    m66         0.582623
    prophet     0.539869
    cpnp        0.535341
    ecp         0.522748
    rfpop       0.392427
    wbs         0.330206
    kcpa        0.061955

    default_f1_avg
    binseg      0.744400
    pelt        0.709992
    amoc        0.703711
    bocpd       0.689622
    segneigh    0.675545
    zero        0.668967
    m66         0.648660
    cpnp        0.606694
    ecp         0.597710
    bocpdms     0.507137
    rfpop       0.499453
    prophet     0.487742
    rbocpdms    0.446714
    wbs         0.411706
    kcpa        0.111007
    """

Test params in their algorithms

make venv, numpy before scipy

But make venv throws an error, numpy needs to be installed before scipy.

.. code-block:: bash

    Building wheels for collected packages: scipy
      Building wheel for scipy (setup.py) ... error
      ERROR: Command errored out with exit status 1:
       command: /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/rbocpdms/venv/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/setup.py'"'"'; __file__='"'"'/tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-xdsqoeuc
           cwd: /tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/
      Complete output (9 lines):
      /tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/setup.py:114: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
        import imp
      Traceback (most recent call last):
        File "<string>", line 1, in <module>
        File "/tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/setup.py", line 474, in <module>
          setup_package()
        File "/tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/setup.py", line 450, in setup_package
          from numpy.distutils.core import setup
      ModuleNotFoundError: No module named 'numpy'
      ----------------------------------------
      ERROR: Failed building wheel for scipy
      Running setup.py clean for scipy
      ERROR: Command errored out with exit status 1:
       command: /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/rbocpdms/venv/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/setup.py'"'"'; __file__='"'"'/tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' clean --all
           cwd: /tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b
      Complete output (11 lines):
      /tmp/pip-install-qro7kaiy/scipy_d9e7a8590f554c5c851b4993f896117b/setup.py:114: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
        import imp
      `setup.py clean` is not supported, use one of the following instead:
        - `git clean -xdf` (cleans all files)
        - `git clean -Xdf` (cleans all versioned files, doesn't touch
                            files that aren't checked into the git repo)
      Add `--force` to your command to use it anyway if you must (unsupported).
      ----------------------------------------
      ERROR: Failed cleaning build dir for scipy
    Failed to build scipy
    Installing collected packages: six, pytz, python-dateutil, pyparsing, py, pluggy, numpy, more-itertools, kiwisolver, cycler, attrs, atomicwrites, scipy, pytest, matplotlib
        Running setup.py install for scipy ... \^canceled
    ERROR: Operation cancelled by user
    WARNING: You are using pip version 21.2.3; however, version 21.2.4 is available.
    You should consider upgrading via the '/opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/rbocpdms/venv/bin/python -m pip install --upgrade pip' command.
    make: *** [execs/python/rbocpdms/venv] Error 1

Test params bug

No bugs. They type the arguments and the make_param_dict function works as
desired in their algorithms.

.. code-block:: bash

    gary@mc11:~$ tail /tmp/TCPDBench.bocpdms.debug.parameters.txt
    parameters: {'intensity': 100.0, 'prior_a': 100.0, 'prior_b': 1.0, 'threshold': 100, 'use_timeout': True, 'S1': 1, 'S2': 1, 'intercept_grouping': None, 'prior_mean_scale': 0, 'prior_var_scale': 1, 'lower_AR': 1, 'upper_AR': 4}
    detector = run_bocpdms(mat, {'intensity': 100.0, 'prior_a': 100.0, 'prior_b': 1.0, 'threshold': 100, 'use_timeout': True, 'S1': 1, 'S2': 1, 'intercept_grouping': None, 'prior_mean_scale': 0, 'prior_var_scale': 1, 'lower_AR': 1, 'upper_AR': 4})
    agrs: Namespace(input='/opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/datasets/well_log.json', intensity=200.0, output=None, prior_a=0.01, prior_b=100.0, threshold=100, use_timeout=True)
    defaults: {'S1': 1, 'S2': 1, 'intercept_grouping': None, 'prior_mean_scale': 0, 'prior_var_scale': 1, 'lower_AR': 1, 'upper_AR': 4}
    parameters: {'intensity': 200.0, 'prior_a': 0.01, 'prior_b': 100.0, 'threshold': 100, 'use_timeout': True, 'S1': 1, 'S2': 1, 'intercept_grouping': None, 'prior_mean_scale': 0, 'prior_var_scale': 1, 'lower_AR': 1, 'upper_AR': 4}
    detector = run_bocpdms(mat, {'intensity': 200.0, 'prior_a': 0.01, 'prior_b': 100.0, 'threshold': 100, 'use_timeout': True, 'S1': 1, 'S2': 1, 'intercept_grouping': None, 'prior_mean_scale': 0, 'prior_var_scale': 1, 'lower_AR': 1, 'upper_AR': 4})
    agrs: Namespace(input='/opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/datasets/well_log.json', intensity=50.0, output=None, prior_a=0.01, prior_b=0.01, threshold=100, use_timeout=True)
    defaults: {'S1': 1, 'S2': 1, 'intercept_grouping': None, 'prior_mean_scale': 0, 'prior_var_scale': 1, 'lower_AR': 1, 'upper_AR': 4}
    parameters: {'intensity': 50.0, 'prior_a': 0.01, 'prior_b': 0.01, 'threshold': 100, 'use_timeout': True, 'S1': 1, 'S2': 1, 'intercept_grouping': None, 'prior_mean_scale': 0, 'prior_var_scale': 1, 'lower_AR': 1, 'upper_AR': 4}
    detector = run_bocpdms(mat, {'intensity': 50.0, 'prior_a': 0.01, 'prior_b': 0.01, 'threshold': 100, 'use_timeout': True, 'S1': 1, 'S2': 1, 'intercept_grouping': None, 'prior_mean_scale': 0, 'prior_var_scale': 1, 'lower_AR': 1, 'upper_AR': 4})
    gary@mc11:~$

abed no source in shell

I had to use . instead of source, otherwise the abed tasks failed, as /bin/sh
on this machine does not have a source command.

.. code-block:: bash

    [2021-08-17 08:21:33] Executing: 'source ./opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/bocpdms/venv/bin/activate && /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/bocpdms/venv/bin/python /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/cpdbench_bocpdms.py -i /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/datasets/well_log.json --intensity 50 --prior-a 1.0 --prior-b 1.0 --threshold 100 --use-timeout'
    Error: There was an error executing: 'source ./opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/bocpdms/venv/bin/activate && /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/bocpdms/venv/bin/python /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/cpdbench_bocpdms.py -i /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/datasets/well_log.json --intensity 100 --prior-a 0.01 --prior-b 0.01 --threshold 100 --use-timeout'. Here is the error:
    /bin/sh: 1: source: not found
    Error: There was an error executing: 'source ./opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/bocpdms/venv/bin/activate && /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/bocpdms/venv/bin/python /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/execs/python/cpdbench_bocpdms.py -i /opt/python_virtualenv/projects/TCPDBench-py386/TCPDBench/datasets/well_log.json --intensity 50 --prior-a 1.0 --prior-b 1.0 --threshold 100 --use-timeout'. Here is the error:

.. code-block:: python

    # "best_bocpdms": (
    #     "source {execdir}/python/bocpdms/venv/bin/activate && python {execdir}/python/cpdbench_bocpdms.py -i {datadir}/{dataset}/{dataset}.json --intensity {intensity} --prior-a {prior_a} --prior-b {prior_b} --threshold 100 --use-timeout"
    # ),
    "best_bocpdms": (
        ". {execdir}/python/bocpdms/venv/bin/activate && python {execdir}/python/cpdbench_bocpdms.py -i {datadir}/{dataset}/{dataset}.json --intensity {intensity} --prior-a {prior_a} --prior-b {prior_b} --threshold 100 --use-timeout"
    ),

Heatmaps of results

.. code-block:: python

    import pandas as pd

    algorithms_results = {
        'best_cover_avg': {
            'm66': 0.795455, 'bocpd': 0.789368, 'segneigh': 0.783592,
            'binseg': 0.780245, 'amoc': 0.745989, 'bocpdms': 0.743791,
            'pelt': 0.725448, 'ecp': 0.720113, 'kcpa': 0.625981,
            'zero': 0.578767, 'prophet': 0.576222, 'cpnp': 0.552106,
            'wbs': 0.428036, 'rfpop': 0.414154},
        'best_f1_avg': {
            'bocpd': 0.879675, 'binseg': 0.855901, 'segneigh': 0.854902,
            'm66': 0.841738, 'amoc': 0.798912, 'ecp': 0.796599,
            'pelt': 0.787211, 'kcpa': 0.683167, 'cpnp': 0.665980,
            'zero': 0.662375, 'bocpdms': 0.620328, 'prophet': 0.534355,
            'wbs': 0.532729, 'rfpop': 0.530942},
        'default_cover_avg': {
            'binseg': 0.705799, 'amoc': 0.701605, 'pelt': 0.688798,
            'segneigh': 0.676410, 'bocpd': 0.636019, 'bocpdms': 0.633351,
            'rbocpdms': 0.628623, 'zero': 0.582727, 'm66': 0.582623,
            'prophet': 0.539869, 'cpnp': 0.535341, 'ecp': 0.522748,
            'rfpop': 0.392427, 'wbs': 0.330206, 'kcpa': 0.061955},
        'default_f1_avg': {
            'binseg': 0.744400, 'pelt': 0.709992, 'amoc': 0.703711,
            'bocpd': 0.689622, 'segneigh': 0.675545, 'zero': 0.668967,
            'm66': 0.648660, 'cpnp': 0.606694, 'ecp': 0.597710,
            'bocpdms': 0.507137, 'rfpop': 0.499453, 'prophet': 0.487742,
            'rbocpdms': 0.446714, 'wbs': 0.411706, 'kcpa': 0.111007,
        }
    }

    # Heatmap of all scores, one column per metric
    df = pd.DataFrame.from_dict(algorithms_results, orient='columns')
    df.style.background_gradient(cmap='Greens').set_properties(**{'font-size': '12px'})
    # Highlight the best score in each column
    df.style.highlight_max(color='lightblue', axis=0)

    def highlight_top3(s):
        """Colour the 3 largest values green and the 3 smallest values red."""
        result = []
        is_large = s.nlargest(3).values
        is_small = s.nsmallest(3).values
        for i in s:
            if i in is_large:
                result.append('background-color: lightgreen')
            elif i in is_small:
                result.append('background-color: #FFCCCB')
            else:
                result.append('')
        return result

    df.style.apply(highlight_top3)

    # results and rank
    algorithms_results_and_rank = algorithms_results.copy()
    for score in list(algorithms_results.keys()):
        # Sort the algorithms by score, highest first
        scores = []
        for algo in algorithms_results[score]:
            scores.append([algo, algorithms_results[score][algo]])
        sorted_scores = sorted(scores, key=lambda x: x[1], reverse=True)
        # The rank is the 1-based position in the sorted list
        rank_dict = {}
        for algo in algorithms_results[score]:
            for index, item in enumerate(sorted_scores):
                if item[0] == algo:
                    rank_dict[algo] = index + 1
        metric = '%s rank' % score
        algorithms_results_and_rank[metric] = rank_dict.copy()

    df = pd.DataFrame.from_dict(algorithms_results_and_rank, orient='columns')
    df.style.background_gradient(cmap='Greens').background_gradient(cmap='Blues_r', subset=['best_cover_avg rank','best_f1_avg rank','default_cover_avg rank','default_f1_avg rank']).set_properties(**{'font-size': '12px'})
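
The rank columns can also be derived with pandas directly rather than with
nested loops. The following is a minimal sketch of that alternative, using a
small subset of the scores above (three algorithms, two metrics) to keep it
short. The helper name ``add_rank_columns`` is hypothetical and is not part of
Skyline or TCPDBench.

.. code-block:: python

    import pandas as pd


    def add_rank_columns(scores):
        """Return the scores with a '<metric> rank' column added per metric.

        add_rank_columns is a hypothetical helper used for illustration only.
        """
        df = pd.DataFrame.from_dict(scores, orient='columns')
        # ascending=False ranks the highest score as 1, an algorithm with no
        # score for a metric (NaN) gets no rank for that metric
        ranks = df.rank(ascending=False, method='min')
        ranks.columns = ['%s rank' % column for column in df.columns]
        return pd.concat([df, ranks], axis=1)


    # A subset of the results above
    subset = {
        'best_cover_avg': {'m66': 0.795455, 'bocpd': 0.789368, 'pelt': 0.725448},
        'default_cover_avg': {'m66': 0.582623, 'bocpd': 0.636019, 'pelt': 0.688798},
    }
    ranked = add_rank_columns(subset)
    print(ranked)

One difference to be aware of is that with ``method='min'`` tied scores share
the same rank, whereas the loop above gives tied scores consecutive ranks in
sorted order. The ranks are also returned as floats, so that missing scores
can be represented as NaN.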