Google Analytics for Streamlit application
Streamlit and Google Analytics
In the previous article, we deployed Streamlit on our server. Next, we probably want to add Google Analytics (GA) so that we can analyze access to the application.
At first, I tried to output the Google Analytics code snippet from within the Streamlit application by calling st.markdown(ga_code, unsafe_allow_html=True), but the snippet was not output.
After that, I found the solution below. https://stackoverflow.com/questions/76034389/google-analytics-is-not-working-on-streamlit-application
So I applied this solution at the moment the Docker container starts.
How to apply it
As explained on the Stack Overflow page above, this solution involves directly modifying the Streamlit source code and embedding the GA code snippet. As explained last time, we deploy Streamlit as a Docker container, so we edit the Streamlit source code for GA when the container starts.
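Before editing anything, it can help to confirm where the file to be patched actually lives inside the container. The following is a minimal sketch using only the standard library. The helper name static_index_path is hypothetical, and the static/index.html layout is an assumption taken from the script in this article, so it may differ between Streamlit versions.

```python
import importlib.util
import pathlib


def static_index_path(package_name: str) -> pathlib.Path:
    """Return <package dir>/static/index.html for an installed package.

    The static/index.html layout is an assumption based on this article,
    not a documented Streamlit API.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"{package_name} is not installed")
    # For a regular package, spec.origin points at its __init__.py.
    return pathlib.Path(spec.origin).parent / "static" / "index.html"


if __name__ == "__main__":
    try:
        path = static_index_path("streamlit")
        print(path, "exists" if path.exists() else "is missing")
    except ModuleNotFoundError as exc:
        print(exc)
```

Running this inside the container prints the path that will be modified, which is a quick sanity check before changing a file in an installed package.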
First, we need to install beautifulsoup4 and lxml, which are used to edit Streamlit's index.html.
requirements.txt:
(...snip...)
beautifulsoup4==4.12.3
lxml==5.3.0
Next, add a Python script named setup_ga.py. (Please replace "G-xxxxxxxxxx" with your GA measurement ID.)
import logging
import pathlib
import shutil

import streamlit as st
from bs4 import BeautifulSoup


def inject_ga():
    GA_ID = "google_analytics"
    # The id on the first script tag must match GA_ID so that an earlier
    # injection can be detected and the snippet is not inserted twice.
    GA_JS = """
<!-- Google tag (gtag.js) -->
<script async id="google_analytics" src="https://www.googletagmanager.com/gtag/js?id=G-xxxxxxxxxx"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());
  gtag('config', 'G-xxxxxxxxxx');
</script>
"""
    # Insert the script in the head tag of the static template inside the installed Streamlit package
    index_path = pathlib.Path(st.__file__).parent / "static" / "index.html"
    logging.info(f"editing {index_path}")
    soup = BeautifulSoup(index_path.read_text(), features="lxml")
    if not soup.find(id=GA_ID):  # the GA snippet has not been injected yet
        bck_index = index_path.with_suffix(".bck")
        if bck_index.exists():
            shutil.copy(bck_index, index_path)  # recover from backup
            # Re-parse so that we modify the restored file, not the stale copy
            soup = BeautifulSoup(index_path.read_text(), features="lxml")
        else:
            shutil.copy(index_path, bck_index)  # keep a backup
        html = str(soup)
        new_html = html.replace("<head>", "<head>\n" + GA_JS)
        index_path.write_text(new_html)


if __name__ == "__main__":
    # Without this, logging.info() is suppressed by the default WARNING level
    logging.basicConfig(level=logging.INFO)
    inject_ga()
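Because the script runs every time the container starts, the insert should be safe to repeat. As a minimal, standard-library-only sketch of that idea (the names inject_snippet, MARKER_ID and SNIPPET are hypothetical and not part of Streamlit or the script above), the snippet can carry an id that is checked before writing:

```python
import pathlib
import tempfile

MARKER_ID = "google_analytics"  # hypothetical id used to detect an earlier injection
SNIPPET = f'<script id="{MARKER_ID}">/* analytics snippet goes here */</script>'


def inject_snippet(index_path: pathlib.Path, snippet: str = SNIPPET) -> bool:
    """Insert snippet right after <head>; return True if the file was changed.

    The check only works if the snippet itself contains id="<MARKER_ID>".
    """
    html = index_path.read_text(encoding="utf-8")
    if f'id="{MARKER_ID}"' in html:
        return False  # already injected, nothing to do
    if "<head>" not in html:
        raise ValueError(f"no <head> tag found in {index_path}")
    # Replace only the first occurrence so the snippet is added exactly once.
    new_html = html.replace("<head>", "<head>\n" + snippet, 1)
    index_path.write_text(new_html, encoding="utf-8")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        page = pathlib.Path(tmp) / "index.html"
        page.write_text("<html><head><title>demo</title></head><body></body></html>",
                        encoding="utf-8")
        print(inject_snippet(page))  # first call modifies the file
        print(inject_snippet(page))  # second call leaves it untouched
```

The design choice here is to detect the marker in the file itself rather than rely on a backup copy, so restarting the same container does not stack up duplicate tags.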
Then add entry-point.sh, which runs setup_ga.py when the container starts.
#!/bin/bash
set -e
python3 ./setup_ga.py
streamlit run app.py --server.baseUrlPath=/pdf-to-csv --server.port=8501 --server.address=0.0.0.0
Finally, update the Dockerfile so that it copies all the files and calls entry-point.sh when the container starts.
FROM python:3.9-slim
WORKDIR /app
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    software-properties-common \
    git \
    openjdk-17-jre \
    && rm -rf /var/lib/apt/lists/*
COPY ./app.py .
COPY ./setup_ga.py .
COPY ./requirements.txt .
COPY ./entry-point.sh .
RUN chmod 755 ./entry-point.sh
RUN pip3 install -r requirements.txt
EXPOSE 8501
ENTRYPOINT ["/app/entry-point.sh"]
Then build the container image and deploy it. The GA code snippet should now be output inside the head tag of the page that Streamlit serves.
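One way to confirm the result without opening the browser's developer tools is to fetch the page and look for the gtag.js script in the head. The following is a rough sketch using only the standard library. The names GtagFinder, has_gtag and the file name check_ga.py are hypothetical, and the URL is an assumption based on the port and base path used in entry-point.sh.

```python
import sys
import urllib.request
from html.parser import HTMLParser


class GtagFinder(HTMLParser):
    """Collect the src attribute of every <script> tag inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head:
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def has_gtag(html: str, measurement_id: str) -> bool:
    """Return True if <head> loads gtag.js for the given measurement ID."""
    finder = GtagFinder()
    finder.feed(html)
    wanted = f"googletagmanager.com/gtag/js?id={measurement_id}"
    return any(wanted in src for src in finder.sources)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_ga.py <page URL> <measurement ID>
    with urllib.request.urlopen(sys.argv[1]) as resp:
        page = resp.read().decode("utf-8", errors="replace")
    print(has_gtag(page, sys.argv[2]))
```

For the setup in this article, it could be run as python check_ga.py http://localhost:8501/pdf-to-csv/ G-xxxxxxxxxx, assuming the container's port 8501 is reachable from where the check runs.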
Conclusion
I showed how to embed a GA code snippet in Streamlit, which lets you analyze web access to the Streamlit application with GA. It's a bit of a hacky approach, isn't it? I hope that in the future it will be possible to embed arbitrary code inside the head tag of Streamlit.