<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>Posts tagged with “Docker” on Mark van Lent’s weblog</title>
  <updated>2026-01-31T00:00:00+00:00</updated>
  <link rel="self" type="application/atom+xml" href="https://markvanlent.dev/tags/docker/index.xml" hreflang="en"/>
  <id>tag:markvanlent.dev,2010-04-02:/tags/docker/index.xml</id>
  <link rel="alternate" type="text/html" href="https://markvanlent.dev/tags/docker/" hreflang="en"/>
  <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
  <rights>Copyright (c) Mark van Lent, Creative Commons Attribution 4.0 International License.</rights>
  <icon>https://markvanlent.dev/favicon.ico</icon>
  <entry>
    <title type="html"><![CDATA[FOSDEM 2026]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2026/01/31/fosdem-2026/" type="text/html" />
    <id>https://markvanlent.dev/2026/01/31/fosdem-2026/</id>
    <author>
<name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="ansible" />
    <category term="conference" />
    <category term="docker" />
    <category term="infrastructure as code" />
    <category term="git" />
    <category term="python" />
    <category term="security" />
    
    <updated>2026-02-01T14:54:22Z</updated>
    <published>2026-01-31T00:00:00Z</published>
    <content type="html"><![CDATA[<p>January is already almost over, so time for <a href="https://fosdem.org/2026/">FOSDEM</a>,
the yearly <q>free event for software developers to meet, share ideas and
collaborate</q> in Brussels. <a href="/2025/02/01/fosdem-2025/">Last year</a> I
focussed on the Go track; this year I selected a mix of security- and
Python-related talks to attend.</p>
<h2 id="streamlining-signed-artifacts-in-container-ecosystems--tonis-tiigi">Streamlining Signed Artifacts in Container Ecosystems &mdash; Tonis Tiigi</h2>
<p>It&rsquo;s possible to sign Docker images, but at the moment most are actually not
signed. Also, users should understand what the signature is protecting and what
it&rsquo;s <em>not</em> protecting. We should not want signing just to tick a box on the
security checklist, but because of the security it adds. And we need something
simple: integrated with existing tools and not slowing them down.</p>
<p>Buildkit powers &ldquo;<code>docker build</code>&rdquo; but is not limited to Dockerfiles. It&rsquo;s
high-performance, can handle complex builds and has caching.</p>
<p>A modern build is a graph of images, Git repositories, local files, etc. The
results are images, binaries, archives.</p>
<figure><img src="/images/fosdem2026_tonis_tiigi.jpg"
    alt="Photo of Tonis Tiigi explaining the graph that is modern software building"><figcaption>
      <p>Tonis Tiigi explaining that builds of modern software are a complex graph</p>
    </figcaption>
</figure>

<p>We need Supply-chain Levels for Software Artifacts (SLSA) provenance: what has
actually happened in the build? What was the build config? Et cetera. It&rsquo;s useful to
figure out how an artifact was built.</p>
<p>Buildkit does not sign images by default. GitHub has <a href="https://docs.github.com/en/packages/managing-github-packages-using-github-actions-workflows/publishing-and-installing-a-package-with-github-actions#publishing-a-package-using-an-action">an example in the
documentation</a>
to run a build with Buildkit and generate an artifact. It claims to generate an
<q>unforgeable statement</q>. But if your GitHub credentials are
leaked and the attacker can get their hands on the temporary signing key, they
can use it to sign their own artifacts.</p>
<p>Docker created the <a href="https://github.com/docker/github-builder">github-builder</a>
repository. It contains reusable GitHub Actions to securely build images. If you
use this, your images are signed to prove that they were built from a certain
repository, using the configured build steps. Where Buildkit (among other
things) provides isolation, <code>github-builder</code> provides signing context. It also
protects against build dependency leaks.</p>
<p>So that takes care of the signatures, but how do you verify them?</p>
<ul>
<li>The command &ldquo;<code>docker inspect</code>&rdquo; now shows verified signatures</li>
<li>You can manually verify it with <a href="https://github.com/sigstore/cosign">cosign</a> (see the sketch below)</li>
<li>You can also use sigstore/policy-controller for Kubernetes</li>
</ul>
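<p>As a rough sketch of that manual verification (my example, not from the talk):
checking a keyless GitHub Actions signature with cosign could look like this,
where the image reference and identity regexp are placeholders to adapt:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell">cosign verify \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity-regexp 'https://github.com/org/app/' \
  registry.example.com/org/app:latest
</code></pre></div>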
<p>Buildx also includes experimental Rego (Open Policy Agent) policy support. This
means you can write a matching policy for <code>Dockerfile</code>, e.g. <code>Dockerfile.rego</code>,
which is then automatically loaded. All build sources (images, Git repositories,
URLs, etc.) then need to pass the policy for the build to continue.</p>
<p>You can do very complex stuff in the policies. As a simple example, Tonis showed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rego" data-lang="rego"><span class="line"><span class="cl"><span class="kd">package</span><span class="w"> </span><span class="nx">docker</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="n">allow</span><span class="w"> </span><span class="kd">if</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nx">input</span><span class="o">.</span><span class="nx">image</span><span class="o">.</span><span class="nx">repo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">&#34;org/app&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nf">docker_github_builder_tag</span><span class="p">(</span><span class="nx">input</span><span class="o">.</span><span class="nx">image</span><span class="o">,</span><span class="w"> </span><span class="s2">&#34;org/app&#34;</span><span class="o">,</span><span class="w"> </span><span class="nx">input</span><span class="o">.</span><span class="nx">image</span><span class="o">.</span><span class="nx">tag</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This policy should make sure that the image can only be built from this
repository and that the image tag should match the Git tag.</p>
<p>Summary:</p>
<ul>
<li>No reason not to sign</li>
<li>Not all signatures are equal</li>
<li>Software pulling packages should verify pulled content</li>
</ul>
<p><a href="https://fosdem.org/2026/schedule/event/HJAJTU-streamlining_signed_artifacts_in_container_ecosystems/">Link to the conference page</a></p>
<h2 id="sequoia-git-making-signed-commits-matter--neal-h-walfield">Sequoia git: Making Signed Commits Matter &mdash; Neal H. Walfield</h2>
<p>Version control systems (VCSs) track the following:</p>
<ul>
<li>Changes to the code</li>
<li>Authorship</li>
<li>Other metadata</li>
<li>Commit message</li>
</ul>
<p>But the author can be faked: the metadata is set by the author, including the
author&rsquo;s name. After a quick &ldquo;<code>git config</code>&rdquo; command you can commit as anyone you
want, for example <a href="https://en.wikipedia.org/wiki/Linus_Torvalds">Linus Torvalds</a>.
Sure, GitHub could see that the committer (the one pushing the commit) and
author are different. However, this is not necessarily bad because we might
simply want to give proper attribution to the author of the commit.</p>
<p>And in theory the forge might also be compromised, or someone may have gotten
permission to push to the project.</p>
<p>To prevent impersonations, we can cryptographically prove who the author is by
signing the commits. But now the problem shifts to the certificates. Because
anyone can create a key with any name (again, for example Linus) attached to it.
So what does a signed commit mean now?</p>
<p>How can we be sure that the author is who they say they are? There are ways:</p>
<ul>
<li>You could talk to the developer to verify their key</li>
<li>You could go to <a href="https://en.wikipedia.org/wiki/Key_signing_party">key signing parties</a></li>
<li>You can use a central authority that you trust (e.g.
<a href="https://keys.openpgp.org/">keys.openpgp.org</a>, the Linux developer keyring,
the <code>distributions-gpg-keys</code> package, or, if you trust GitHub, use
<code>github.com/&lt;username&gt;.gpg</code>, as shown below)</li>
</ul>
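<p>The GitHub option boils down to importing the keys GitHub serves for a user.
A small sketch, with the username purely as an example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell">curl -s https://github.com/torvalds.gpg | gpg --import
</code></pre></div>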
<p>You can use the following command to show the Git log and the signatures on the commits:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">git log --show-signature
</span></span></code></pre></div><p>But now you need to actually check that the signatures are indeed made by the
certificates you trust.</p>
<p>It&rsquo;s up to the maintainers of the software to curate a list of contributors and
track when contributors join and leave (yes, there is a temporal element as
well). This is hard and maintainers need tooling. And you would want to detect
unauthorized commits (impersonation, a malicious forge, a machine in the middle
or for instance when a project is handed to a new maintainer by a forge/registry).</p>
<p>What does the solution look like?</p>
<ul>
<li>Clear semantics</li>
<li>The project itself maintains signing policy</li>
<li>Third party uses maintainers&rsquo; policy to authenticate project</li>
<li>Verification, not attestation: do not rely on any external authority</li>
</ul>
<p>(Note that the maintainers can still be socially engineered to include the key
of an attacker in their policy. So they still have to be careful about who is
added to the policy.)</p>
<p>Sequoia git provides:</p>
<ul>
<li>Specification</li>
<li>Config</li>
<li>Tooling</li>
</ul>
<p>With <a href="https://gitlab.com/sequoia-pgp/sequoia-git">Sequoia git</a> (which is part
of the <a href="https://sequoia-pgp.org/">Sequoia PGP project</a>) you can have a signing
policy in an <code>openpgp-policy.toml</code> file in the project&rsquo;s Git repository. It
specifies users, their keys and their capabilities. You can use <code>sq-git</code> to help
maintain this file.</p>
<p>For instance to add user Alice and then describe the current policy, you can use
the following commands:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">sq-git policy authorize alice --committer &lt;cert&gt;
</span></span><span class="line"><span class="cl">sq-git policy describe
</span></span></code></pre></div><p>A commit is &ldquo;authenticated&rdquo; if at least one parent commit says the commit is
acceptable (via the policy). To verify that there is an authenticated path from
the current state back to a certain commit we trust, use this command:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">sq-git log --trust-root &lt;sha of trusted commit&gt;
</span></span></code></pre></div><p>Projects may have contributions from others that are not included in the policy.
To maintain an authenticated path when accepting the contribution, a trusted
author needs to merge the contribution via a merge commit that <em>is</em>
authenticated. (You may need to use the &ldquo;<code>--no-ff</code>&rdquo; flag on the merge to
make sure there is a merge commit though; see the sketch below.)</p>
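<p>A minimal sketch of such a merge, assuming the contribution lives on a branch
called <code>contrib/feature</code> (a made-up name) and Git is configured to sign
commits:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># -S signs the merge commit; --no-ff forces a merge commit to exist
git merge --no-ff -S contrib/feature
</code></pre></div>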
<p><a href="https://fosdem.org/2026/schedule/event/KFSUCW-sequoia-git/">Link to the conference page</a></p>
<h2 id="an-endpoint-telemetry-blueprint-for-security-teams--victor-lyuboslavsky">An Endpoint Telemetry Blueprint for Security Teams &mdash; Victor Lyuboslavsky</h2>
<p>With open source we can inspect something that is broken and we can change the
defaults. With security we are used to the opposite; it&rsquo;s a black box. We are
not used to owning the data. The data exists on the endpoints, but ownership is
transferred to a different team. How can we add more security in a way engineers
understand and can use?</p>
<p>Victor presents a blueprint with the following layers:</p>
<ul>
<li>Endpoint agents</li>
<li>Control layer</li>
<li>Ingestion, streaming &amp; storage</li>
<li>Detection</li>
<li>Correlation, intelligence and response</li>
</ul>
<p>The value is not in the layers themselves, but in the boundaries. For example,
the ingestion layer should move the data reliably but should not care which tool
collected it. This keeps the layers loosely coupled.</p>
<p>For endpoint agents Victor suggests
<a href="https://github.com/osquery/osquery">osquery</a> which allows basic questions about
endpoints. Data is structured and consistent. It aligns with open source values.
(Alternatives: scripts &amp; cron, log shippers like filebeat or tools like auditd
or Event Tracing for Windows.)</p>
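<p>To give a feel for what such a basic question looks like (my sketch, not the
speaker&rsquo;s), an ad-hoc osquery query on a Debian-based endpoint:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># List a few installed packages with their versions
osqueryi "SELECT name, version FROM deb_packages ORDER BY name LIMIT 5;"
</code></pre></div>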
<p>Controlling the data (the next layer) means that you want to have:</p>
<ul>
<li>Central config</li>
<li>Live queries</li>
<li>Consistent schemas</li>
</ul>
<p><a href="https://github.com/fleetdm/fleet">Fleet</a> (disclaimer: Victor works here) is
built to manage <code>osquery</code> at scale and a good candidate for this layer.</p>
<p>The control layer needs to work hand-in-hand with the ingestion layer. The ingestion
layer moves data to downstream systems. E.g. <a href="https://github.com/vectordotdev/vector">Vector</a> or
<a href="https://www.elastic.co/logstash">Logstash</a> can be used here.</p>
<blockquote>
<p>Ingestion isn&rsquo;t where you get clever. It&rsquo;s where you get reliable.</p></blockquote>
<p>Streaming decouples producers from consumers and e.g. allows replay. Note that this
is an optional step and it would come <em>after</em> ingestion, not <em>in place of</em> it.
For instance <a href="https://kafka.apache.org/">Apache Kafka</a> can be used in this
layer. Ingestion absorbs the mess. Streaming preserves flexibility.</p>
<p>The storage layer is where telemetry becomes durable. It&rsquo;s about being able to
ask hard questions later. Examples of useful tools:
<a href="https://github.com/ClickHouse/ClickHouse">ClickHouse</a>,
<a href="https://www.elastic.co/elasticsearch">Elasticsearch</a> (which is better at text
search) and <a href="https://github.com/apache/iceberg">Iceberg</a> (which is slower for
active investigation).</p>
<p>For the detection layer you might want to use
<a href="https://github.com/SigmaHQ/Sigma">Sigma</a>. It provided portability. Rules are
translated to native SQL running on ClickHouse. Intent (Sigma signatures)
becomes execution (SQL query to get the data).</p>
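<p>As an illustration of that translation step (my sketch, not from the talk),
the <code>sigma-cli</code> tool converts a rule for a given backend. I&rsquo;m showing the
Splunk backend because I know it exists; whether a ClickHouse SQL backend plugin
is available is an assumption on my part:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># The rule file name is a placeholder
sigma convert -t splunk rules/suspicious_cron.yml
</code></pre></div>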
<p>Finally the correlation layer: <a href="https://github.com/grafana/grafana">Grafana</a>
can be used for correlation and visualisation. Grafana can query ClickHouse.
Grafana also has alerting.</p>
<p>Note that response isn&rsquo;t just about automation. It&rsquo;s also about pausing to ask
better questions. The correlation layer should focus on enabling humans to act.</p>
<p>Open endpoint telemetry is <strong>not</strong> an &ldquo;EDR killer&rdquo;. It does not replace it. It adds
diversity and complements other tools. It provides a second set of eyes.</p>
<p><a href="https://fosdem.org/2026/schedule/event/HYXTPH-endpoint-telemetry-blueprint/">Link to the conference page</a></p>
<h2 id="the-bakery-how-pep810-sped-up-my-bread-operations-business--jacob-coffee">The Bakery: How PEP810 sped up my bread operations business &mdash; Jacob Coffee</h2>
<p>Python loads imports eagerly by default. This leads to memory bloat and cold
start issues. Explicit lazy imports (see
<a href="https://peps.python.org/pep-0810/">PEP 810</a>) only import a module when it&rsquo;s
first accessed, not when the import statement is executed.</p>
<p>Lazy imports are scheduled to be included in Python 3.15 and look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="n">lazy</span> <span class="kn">import</span> <span class="nn">foo</span> <span class="kn">from</span> <span class="nn">bar</span>
</span></span></code></pre></div><p>The design principles applied are that lazy imports are:</p>
<ul>
<li>Explicit</li>
<li>Local</li>
<li>Granular</li>
</ul>
<p>When Python parses a lazy import, a proxy object is created. Only when the
module is actually used is the proxy transparently replaced by the real module.
You will not always see improvements, so do not blindly replace all imports with
lazy imports.</p>
<p>PEP 810 also eliminates the need for <code>TYPE_CHECKING</code> guards. (See the <a href="https://docs.python.org/3/library/typing.html#typing.TYPE_CHECKING">typing
docs</a>, in
short: importing a module that is expensive and only contains types used for
type checking in an &ldquo;<code>if TYPE_CHECKING:</code>&rdquo; block.) It also enables faster test
discovery and collection, lower memory usage and less cold start slowness in
e.g. AWS Lambda functions, CLI applications, etc.</p>
<p>Meta (with Cinder) saw a 70% startup time reduction and 40% memory savings.
PySide has a 35% startup improvement.</p>
<p>About CLI tools: when using lazy imports you might notice the difference when
using <code>--help</code>. There&rsquo;s no need to load all dependencies to just output the help
text of a tool.</p>
<p>Some notes:</p>
<ul>
<li>Import time side effects (e.g. logging configuration, DB connections) are also
delayed!</li>
<li>Type checkers need to be updated</li>
<li>Import errors move to first use (so at runtime, not at startup). Keep that in
mind when debugging</li>
<li>It&rsquo;s not always faster, so profile your application before migrating and see
where you can potentially benefit (see the sketch after this list)</li>
<li>Document your lazy imports!</li>
<li>You cannot do lazy imports in functions</li>
</ul>
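<p>A hedged example of such profiling with the interpreter&rsquo;s built-in
<code>-X importtime</code> flag; <code>myapp</code> is a placeholder module name:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># Write the import-time tree to a file, then list the slowest imports
python -X importtime -c "import myapp" 2&gt; imports.log
sort -t'|' -k2 -n imports.log | tail -n 10
</code></pre></div>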
<p>Circular imports are probably still a problem, but they just show up later.</p>
<p><a href="https://github.com/JacobCoffee/breadctl">Link to the repo for this talk</a></p>
<p><a href="https://fosdem.org/2026/schedule/event/HAAABD-the_bakery_how_pep810_sped_up_my_bread_operations_business/">Link to the conference page</a></p>
<h2 id="modern-python-monorepo-with-uv-workspaces-prek-and-shared-libraries--jarek-potiuk">Modern Python monorepo with <code>uv</code>, <code>workspaces</code>, <code>prek</code> and shared libraries &mdash; Jarek Potiuk</h2>
<p>Jarek is, besides his other roles, the number 1 Apache Airflow contributor. The
<a href="https://github.com/apache/airflow">Apache Airflow repo</a> is the monorepo he
talks about today. There is also a series of blog posts about this topic: see
<a href="https://medium.com/apache-airflow/modern-python-monorepo-for-apache-airflow-part-1-1fe84863e1e1">part 1</a>,
which links to the other parts.</p>
<p>Airflow drove early requirements for
<a href="https://docs.astral.sh/uv/concepts/projects/workspaces/">uv workspaces</a>. They now
manage 120+ distributions seamlessly with it. Workspaces allow them to combine
distributions so they work together, and to import from one
distribution in another.</p>
<p>The project shares a single virtual environment, used by <code>uv</code>, in the root of the project.
If you run &ldquo;<code>uv sync</code>&rdquo; from the top level you get everything. If you run it in a
subdirectory (e.g. <code>airflow-core</code>) you only get what is needed for that
distribution.</p>
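<p>A small sketch of that workflow; <code>airflow-core</code> comes from the Airflow
repo, and (if I read the <code>uv</code> docs correctly) <code>--package</code> gives the
same selection from the root:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell">uv sync                          # from the repo root: every workspace member
uv sync --package airflow-core   # only this distribution, selected from the root
cd airflow-core &amp;&amp; uv sync       # the same effect from within the subdirectory
</code></pre></div>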
<p>Benefits of the <code>uv</code> workspaces:</p>
<ul>
<li>Isolated</li>
<li>Explicit</li>
<li>Flexible</li>
</ul>
<p><a href="https://hatch.pypa.io/1.12/">Hatch</a> has (or will have, at the time of writing)
largely compatible workspaces.</p>
<p>However <a href="https://pre-commit.com/">pre-commit</a> became a bottleneck. They needed
to run 170+ pre-commit hooks <strong>on every commit</strong>.
<a href="https://github.com/j178/prek">Prek</a> is a drop-in replacement for pre-commit and
works fantastically. It is optimized for speed and monorepos.</p>
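<p>Because it is a drop-in replacement, the commands mirror pre-commit&rsquo;s. A
minimal sketch, assuming an existing <code>.pre-commit-config.yaml</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell">uv tool install prek   # or: pip install prek
prek install           # set up the Git hook, just like pre-commit
prek run --all-files   # run every hook across the whole repo
</code></pre></div>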
<p>Airflow uses symlinked shared libraries (where a shared lib is also a
distribution). The Hatchling build backend needs to replace the links with physical
copies during packaging. They use Prek to maintain consistency.</p>
<p><code>uv sync</code> detects conflicts between merged requirements files, and Prek hooks
enforce relative imports in shared code to prevent cross-coupling issues (IIRC).</p>
<p><a href="https://fosdem.org/2026/schedule/event/WE7NHM-modern-python-monorepo-apache-airflow/">Link to the conference page</a></p>
<h2 id="pyinfra-because-your-infrastructure-deserves-real-code-in-python-not-yaml-soup--loïc-wowi42-tosser">PyInfra: Because Your Infrastructure Deserves Real Code in Python, Not YAML Soup &mdash; Loïc &ldquo;wowi42&rdquo; Tosser</h2>
<p>Loïc is a Frenchman (which, as he himself states, means he <strong>must</strong> have
opinions) and, to put it mildly, not a YAML fan. That is: YAML as a programming
language, e.g. how it is used in <a href="https://github.com/ansible/ansible">Ansible</a>.</p>
<figure><img src="/images/fosdem2026_loic_tosser.jpg"
    alt="Photo of Loïc Tosser showing a complex Ansible task in YAML"><figcaption>
      <p>Loïc Tosser demonstrating what happens when you ask a config file to be a programming language</p>
    </figcaption>
</figure>

<p><a href="https://pyinfra.com/">PyInfra</a> is an infrastructure as code library to write
Python code which is then translated to shell scripts to run on the target
hosts. So, in contrast to Ansible, you do not need Python on the target. The
target machine only needs SSH and a POSIX shell. You can also configure Docker
containers with PyInfra.</p>
<blockquote>
<p>If it has SSH, PyInfra can talk to it.</p></blockquote>
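<p>To give a hedged taste of the CLI (the host name is a placeholder; the ad-hoc
form is <code>pyinfra &lt;inventory&gt; &lt;operation&gt; &lt;arguments&gt;</code>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell">pip install pyinfra
# Ad-hoc: make sure htop is installed on a host, via sudo
pyinfra my-server.example.com apt.packages packages=htop update=true _sudo=true
# The same operation against a Docker container instead of an SSH host
pyinfra @docker/ubuntu:24.04 apt.packages packages=htop update=true
</code></pre></div>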
<p>PyInfra has idempotent operations and built-in diff checking. It gives you
declarative infrastructure with actual code, not YAML. You can use inventory from
Terraform, Coolify or any API.</p>
<p>You can leverage the entire Python packaging ecosystem. Slack integration? Just
use the right Python package.</p>
<p>PyInfra is not only a CLI tool, you can also use it as a library.</p>
<p>PyInfra is 10 times faster than Ansible, uses 70% less code, has proper code
reuse via <code>import</code> and proper loops instead of <code>with_items</code>. It can have actual
unit tests and can scale to thousands of servers. Also you no longer have error
messages stating that <q>the error appears to be in &hellip; <strong>but may be
elsewhere in the file</strong> &hellip;</q> (looking at you Ansible). PyInfra has
clear error messages without having to specify <code>-vvvv</code> and wading through
hundreds of lines of output.</p>
<p>The suggested migration path:</p>
<ul>
<li>Start small, one playbook at a time</li>
<li>Use your IDE for autocomplete and refactoring</li>
<li>Leverage Python&rsquo;s standard library and the ecosystem with all its packages</li>
<li>Sleep better because you don&rsquo;t have to debug at 3 AM.</li>
</ul>
<p>Is PyInfra production ready? Yes! It has a stable API, is already in use in
production, it&rsquo;s actively maintained and is MIT licensed (so no commercial
entity behind it to steer its direction).</p>
<p>You can get started today with a simple &ldquo;<code>pip install pyinfra</code>&rdquo;.</p>
<p><a href="https://fosdem.org/2026/schedule/event/VEQTLH-infrastructure-as-python/">Link to the conference page</a></p>
<p>(Note from me, Mark, I found Loïc a great speaker: he has lots of energy, is
funny and can transfer his enthusiasm to the room. If the topic interests you
and the video becomes available, I would recommend watching this talk as a great
sales pitch to get started with PyInfra.)</p>
<h2 id="ducks-to-the-rescue---etl-using-python-and-duckdb--marc-andré-lemburg">Ducks to the rescue - ETL using Python and DuckDB &mdash; Marc-André Lemburg</h2>
<p>ETL stands for Extract, Transform, Load. Nowadays we usually do Extract, Load,
Transform instead, because databases are efficient at processing data.</p>
<p>DuckDB is an open source, in-process analytics database (OLAP). It is similar
to SQLite, but for OLAP workloads. It has great Python support and uses SQL as
its standard query language. It&rsquo;s pip installable and column based
(<a href="https://arrow.apache.org/">Apache Arrow</a>). It&rsquo;s single writer but allows for
multiple readers, so it&rsquo;s not a distributed database.</p>
<p><a href="https://github.com/pola-rs/polars">Polars</a>&rsquo; streaming can help with processing
your data as a line-by-line stream so you don&rsquo;t have to load the whole file in
memory at once.</p>
<p>Example to load a CSV file into DuckDB extremely fast:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">read_csv</span><span class="p">(...)</span><span class="w">
</span></span></span></code></pre></div><p>You can load the data into staging tables first to prepare everything and not
mess up e.g. existing data. You can then transform data in DuckDB, e.g. filter
out unneeded and duplicate data, validate data, fill in missing data, convert
data types, etc. You can do the transforms in SQL. You can even use native
integrations to write to PostgreSQL, MySQL, etc. Or worst case stream to Python.</p>
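<p>A minimal, hedged sketch of that staging flow using the DuckDB CLI; file and
table names are placeholders, and the final <code>events</code> table is assumed to
already exist:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># Extract &amp; load: ingest the CSV into a staging table
duckdb warehouse.duckdb -c "CREATE OR REPLACE TABLE staging AS SELECT * FROM read_csv('events.csv');"
# Transform: deduplicate and filter into the final table
duckdb warehouse.duckdb -c "INSERT INTO events SELECT DISTINCT * FROM staging WHERE id IS NOT NULL;"
</code></pre></div>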
<p>Guidelines:</p>
<ul>
<li>Know your queries, that is: know how your data is going to be used</li>
<li>Use the Pareto principle (80/20 rule): optimize for queries that are used
often</li>
<li>Keep a healthy balance between performance and space requirements (which are
often trade-offs)</li>
</ul>
<p>Huge datasets: use the <a href="https://github.com/duckdb/ducklake">DuckLake</a> extension.</p>
<p>To get started: &ldquo;<code>uv add duckdb</code>&rdquo;. Do some experiments and see how it works for
you.</p>
<p><a href="https://fosdem.org/2026/schedule/event/S7RELZ-ducks_to_the_rescue_-_etl_using_python_and_duckdb/">Link to the conference page</a></p>
<h2 id="my-takeaways">My takeaways</h2>
<ul>
<li>Yes, FOSDEM is crowded and you may not be able to get into every talk you want
to see in person, but it&rsquo;s still nice to be there. It&rsquo;s well organised and
there&rsquo;s a friendly atmosphere. Lots of interesting projects to see and people
to talk to. And it&rsquo;s convenient if you want to sponsor your favorite projects
by buying some merchandise.</li>
<li>It&rsquo;s worth investigating signing Docker images (in the right way) further.</li>
<li>Lazy imports look useful! Once Python 3.15 lands it&rsquo;s worth doing profiling on
the projects I work on to see if we can use those to speed things up on
startup and save some memory.</li>
<li>At work we recently decided to go for a monorepo for a project. I want to see
if/how <code>uv</code> workspaces and <code>prek</code> can help us.</li>
<li>I&rsquo;ve written a bunch of Ansible roles to configure my humble homelab and
laptop. Perhaps it&rsquo;s time to switch to PyInfra? It sounds promising and might
be worth the investment of migrating to.</li>
</ul>
<h2 id="about-the-trip">About the trip</h2>
<p><figure class="float-right"><img src="/images/fosdem2026_atomium.jpg"
    alt="Picture of the Atomium at night" width="200px"><figcaption>
      <p>The <a href="https://en.wikipedia.org/wiki/Atomium">Atomium</a> at night</p>
    </figcaption>
</figure>

Last year I drove to Brussels on Friday and stayed at the city center in the
<a href="https://cityboxhotels.com/hotels/brussels/citybox-brussels">Citybox Brussels
hotel</a> for one
night, since I had to be home on Sunday. The upside: it was just a short (15
minute?) tram ride to the FOSDEM location. Unfortunately it did mean I had to
drive home that evening.</p>
<p>This year I had more time, so I booked a room at
<a href="https://www.falkohotel.be/">Falko Hotel</a> for two nights. It&rsquo;s about a 20&ndash;30
minute drive (depending on traffic) to the <a href="https://www.interparking.be/en/parkings/brussels/toison-d-or/">parking
garage</a> I used.
And from there it&rsquo;s about 20 minutes by public transport to the Université libre de
Bruxelles.</p>
<p>Staying another night meant I had more time for sightseeing, had the time to
write this post from my notes and could drive home well rested the next day.</p>
<p>As for tech: besides a phone and laptop, I also brought along two items that
made the trip more comfortable:</p>
<ul>
<li>A <a href="https://mojogear.eu/en/products/mojogear-mini-evo-10-000-mah-power-bank-22-5w">MOJOGEAR Mini
Evo</a>
powerbank to give my phone extra juice to make it through the day. With 10,000
mAh and up to 22.5W of power it&rsquo;s more than sufficient for a day at a
conference. With its small size and weight of less than 175 grams, it&rsquo;s also
easy to carry around.</li>
<li>A <a href="https://www.gl-inet.com/products/gl-sft1200/">GL.iNet Opal (GL-SFT1200)</a>
travel router. I plug it in, hook it up to the hotel internet, start a VPN
connection and all my other devices automatically connect to it and can use
the internet without the hotel snooping on my traffic. (Not that I have an
indication that my hotel would do that, but theoretically they could if I
would not use a VPN.)</li>
</ul>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Open tabs — December 2022]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2022/12/30/open-tabs-december-2022/" type="text/html" />
    <id>https://markvanlent.dev/2022/12/30/open-tabs-december-2022/</id>
    <author>
<name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="book" />
    <category term="docker" />
    <category term="homelab" />
    <category term="restic" />
    <category term="security" />
    <category term="tabs" />
    
    <updated>2025-09-13T21:07:32Z</updated>
    <published>2022-12-30T00:00:00Z</published>
    <content type="html"><![CDATA[<p>The end of the year is a nice time to review my open tabs on my phone and
computer to see what&rsquo;s worth saving and what is not. So here is
<a href="/tags/tabs/">another round</a>.</p>
<p>Note that I do not necessarily endorse the articles or applications I link to.
Most of the links to tools are here specifically because they seem interesting
to me, but I have no actual experience with them&mdash;hence the need for a reminder
on this list.</p>
<p>I have tried to group the links somewhat, but other than that they are listed in
more or less random order.</p>
<h2 id="development">Development</h2>
<dl>
<dt><a href="https://daniel.feldroy.com/posts/autodocumenting-makefiles">Autodocumenting Makefiles</a></dt>
<dd>A nice trick to document your <code>Makefile</code>.
This article was also discussed on <a href="https://news.ycombinator.com/item?id=30137254">Hacker News</a>.</dd>
<dt><a href="https://github.com/TomNomNom/gron">gron</a></dt>
<dd>From the <code>README</code>: <q>gron transforms JSON into discrete assignments to make it
easier to grep for what you want and see the absolute &lsquo;path&rsquo; to it. It eases the
exploration of APIs that return large blobs of JSON but have terrible
documentation.</q></dd>
<dt><a href="https://asdf-vm.com/">asdf</a></dt>
<dd>A version manager for e.g. Ruby, Node.js, Python.</dd>
<dt><a href="https://sharats.me/posts/shell-script-best-practices/">Shell Script Best Practices</a></dt>
<dd>Some rules of thumb for writing shell scripts which were also
<a href="https://news.ycombinator.com/item?id=33354286">discussed on Hacker News</a>.</dd>
<dt><a href="https://levelup.gitconnected.com/how-to-change-git-default-branch-from-master-3933afab08f9">How to change git default branch from master</a></dt>
<dd>I had to (or wanted to) switch from using the name &ldquo;master&rdquo; for my main branch
to something else (&ldquo;main&rdquo; in most cases) for a couple of Git repositories. It is
not hard, but if you do not do it often, it is convenient to have a guide like this
to make sure you do not forget anything.</dd>
</dl>
<h2 id="blogs">Blogs</h2>
<dl>
<dt><a href="https://bitfieldconsulting.com/">Bitfield Consulting</a></dt>
<dd>I&rsquo;m linking the whole website here since it has a bunch of nice <a href="https://bitfieldconsulting.com/golang">Go related
articles</a> but also interesting articles
in the <a href="https://bitfieldconsulting.com/blog">blog</a>.</dd>
<dt><a href="https://blog.kronis.dev/articles">Kristiāns Kronis&rsquo; blog</a></dt>
<dd>I have a couple of articles on this blog still open to (finish) reading, like
<a href="https://blog.kronis.dev/articles/using-ubuntu-as-the-base-for-all-of-my-containers">Using Ubuntu as the base for all of my containers</a>,
<a href="https://blog.kronis.dev/tutorials/moving-from-gitlab-ci-to-drone-ci">Moving from GitLab CI to Drone CI</a> and
<a href="https://blog.kronis.dev/articles/on-burnout">On burnout</a>.</dd>
<dt><a href="https://www.vharmers.com/">Valentine&rsquo;s blog</a></dt>
<dd>Informative blog of which I still want to read the last two articles in the
<a href="https://www.vharmers.com/tags/opsec/">OpSec blog series</a>.</dd>
<dt><a href="https://www.linuxserver.io/blog">linuxserver.io blog</a></dt>
<dd>A blog by the community that maintains &ldquo;the largest collection of Docker
images on the web&rdquo; (their words).</dd>
</dl>
<h2 id="security">Security</h2>
<dl>
<dt><a href="https://www.goldfiglabs.com/guide/personal-infosec-security-checklist/">The Personal Infosec &amp; Security Checklist</a></dt>
<dd>Actionable best practices to harden your security posture.</dd>
<dt><a href="https://security-list.js.org/#/">Personal security checkist</a></dt>
<dd>Tips for protecting your digital security and privacy.</dd>
<dt><a href="https://defensivecomputingchecklist.com/">A Defensive Computing Checklist</a></dt>
<dd>Another list of tips on how to make your digital life safer.</dd>
<dt><a href="https://routersecurity.org/">Router Security</a></dt>
<dd>A site with the focus on the security of routers. From the same author as the
previous link.</dd>
<dt><a href="https://aegis-icons.github.io/">Aegis-icons</a></dt>
<dd>Unofficial set of icons for the <a href="https://getaegis.app/">Aegis Authenticator</a>
application.</dd>
<dt><a href="https://ppn.snovvcrash.rocks/">Pentester&rsquo;s Promiscuous Notebook</a></dt>
<dd>Notes by and for a pentester.</dd>
</dl>
<h2 id="homelab">Homelab</h2>
<dl>
<dt><a href="https://github.com/BaptisteBdn/docker-selfhosted-apps">BaptisteBdn/docker-selfhosted-apps</a></dt>
<dd>A GitHub repository with guides on how to run a bunch of applications via
Docker.</dd>
<dt><a href="https://petersem.github.io/dockerholics/">Dockerholics Application List</a></dt>
<dd>Another list of applications you can host yourself using Docker containers.</dd>
<dt><a href="https://github.com/awesome-selfhosted/awesome-selfhosted">Awesome-Selfhosted</a></dt>
<dd>Yet another (<em>the</em>?) list of applications you can run yourself.</dd>
<dt><a href="https://github.com/awesome-foss/awesome-sysadmin">Awesome Sysadmin</a></dt>
<dd>A list of Free and Open-Source sysadmin resources.</dd>
<dt><a href="https://containrrr.dev/watchtower/">Watchtower</a></dt>
<dd>Automatically update your Docker containers if newer images are available.</dd>
<dt><a href="https://crazymax.dev/diun/">Diun</a></dt>
<dd>If you do not like the idea of automatically updating your containers with
Watchtower, you might want to look at this <strong>D</strong>ocker <strong>I</strong>mage <strong>U</strong>pdate
<strong>N</strong>otifier application.</dd>
<dt><a href="https://www.drone.io/">Drone</a></dt>
<dd>I&rsquo;m already running a <a href="https://gitea.io/en-us/">Gitea</a> instance and a
<a href="https://hub.docker.com/_/registry">Docker Registry</a>. Drone might be a nice
third component to automatically build projects and e.g. create Docker images
and push them to my internal registry.</dd>
<dt><a href="https://homelab.khuedoan.com/">Khue&rsquo;s Homelab</a></dt>
<dd>Khue Doan has a project to provision, operate and update his homelab. As such
it is a nice inspiration.</dd>
<dt><a href="https://grafana.com/oss/loki/">Grafana Loki</a></dt>
<dd>A log aggregation system that looks like a useful addition to my setup, since
I&rsquo;m already running <a href="https://grafana.com/grafana/">Grafana</a> to visualise some
metrics.</dd>
<dt><a href="https://vector.dev/">Vector</a></dt>
<dd>This also looks like an interesting tool to collect logs.</dd>
<dt><a href="https://github.com/smallstep/certificates">Step Certificates</a></dt>
<dd>I am already using a private certificate authority to create certificates for
the services in my homelab, but it would be nice to have a self hosted
<a href="https://www.rfc-editor.org/rfc/rfc8555">ACME server</a> to do the tedious work.
This tool might be what I need.</dd>
</dl>
<h2 id="entertainment">Entertainment</h2>
<dl>
<dt><a href="https://play.elevatorsaga.com/">Elevator Saga</a></dt>
<dd>Fun game where you program an elevator/set of elevators to meet certain criteria.</dd>
<dt><a href="https://www.movieofthenight.com/">Movie of the Night</a></dt>
<dd>While officially a <q>movie/series recommendation engine</q> I use
this site regularly to check if I can stream a movie or series in my country and
if so, on which service it is available.</dd>
<dt><a href="https://osmc.tv/">OSMC</a></dt>
<dd>An interesting looking open source media center, which you can run on a
Raspberry Pi or on their devices, like the <a href="https://osmc.tv/vero/">Vero 4K+</a>.</dd>
<dt><a href="https://www.amazon.com/Lazarus-Heist-Hollywood-Finance-Inside/dp/024155425X">The Lazarus Heist: From Hollywood to High Finance: Inside North Korea&rsquo;s Global Cyber War</a></dt>
<dd>A book about the
<a href="https://en.wikipedia.org/wiki/Lazarus_Group">Lazarus Group</a>, tipped in
<a href="https://darknetdiaries.com/transcript/119/">Darknet Diaries episode 119</a>.</dd>
<dt><a href="https://scottjucha.com/silverships.html">The Silver Ships Series</a></dt>
<dd>A book series by Scott Jucha, tipped in the
<a href="https://twit.tv/shows/security-now">Security Now podcast</a>. (The series is
mentioned in <a href="https://www.grc.com/sn/sn-887.htm">episode 887</a> for the first
time.) I&rsquo;ve finished the first book and loved reading it!</dd>
<dt><a href="https://nostarch.com/open-circuits">Open Circuits</a></dt>
<dd>A lovely book about electronic components with beautiful pictures.</dd>
</dl>
<h2 id="miscellaneous">Miscellaneous</h2>
<dl>
<dt><a href="https://pfauth.com/intentioneel-leven/persoonlijk-manifest/">Schrijf een persoonlijk manifest voor richting in je werk en leven</a> (Dutch)</dt>
<dd>I&rsquo;m not sure I&rsquo;ll ever write such a personal manifest, but just reading this
article gave me enough food for thought to make some decisions.</dd>
<dt><a href="https://www.lifewire.com/use-file-history-in-windows-10-3891070">How to Use File History in Windows 10</a></dt>
<dd>Useful article for people who want to back up and restore files on
Windows machines.</dd>
<dt><a href="https://restic.net/">Restic</a></dt>
<dd>I&rsquo;m in the process of testing this backup tool to see if I want to switch over
to restic from my current <code>rsync</code> based script to backup my Linux machines to an
external disk. If I go that route, I&rsquo;ll also have to have a look at
<a href="https://github.com/binarybucks/restic-tools">restic-tools</a>.</dd>
<dt><a href="https://shen.hong.io/reproducible-pdfa-compliant-latex/">Creating Fully Reproducible, PDF/A Compliant Documents in LaTeX</a></dt>
<dd>When I was preparing my CV and a cover letter, I wanted the resulting PDF to
be more accessible. This article gave me useful instructions on how to achieve
that.</dd>
<dt><a href="https://thepihut.com/blogs/raspberry-pi-tutorials/using-neopixels-with-the-raspberry-pi">Using Neopixels with the Raspberry Pi</a></dt>
<dd>I&rsquo;m toying with the idea of upgrading my home office with some LED
strips. Perhaps I&rsquo;ll use Neopixels and a Raspberry Pi (or similar board) to do
this.</dd>
<dt><a href="https://jvns.ca/blog/things-your-manager-might-not-know/">Things your manager might not know</a></dt>
<dd>An article about how you can help your manager (help you).</dd>
</dl>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Pulling Docker images via a SOCKS5 proxy]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2022/05/10/pulling-docker-images-via-a-socks5-proxy/" type="text/html" />
    <id>https://markvanlent.dev/2022/05/10/pulling-docker-images-via-a-socks5-proxy/</id>
    <author>
<name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="docker" />
    <category term="proxy" />
    <category term="ssh" />
    
    <updated>2022-05-10T20:14:56Z</updated>
    <published>2022-05-10T00:00:00Z</published>
    <content type="html"><![CDATA[<p>This post describes how you can work around a firewall to pull Docker images
from a server that you do not have direct access to, using a SOCKS5 proxy.</p>
<h2 id="why-would-you-want-to-do-this">Why would you want to do this?</h2>
<p>Let us assume you are in a corporate environment. And let us further assume that
you want to pull Docker images from a registry that you do not have direct
access to from that corporate environment, for instance because the registry is
running on a non-standard port and is thus blocked by a firewall.</p>
<p><img src="/images/blocked_connection.svg" alt="The direct connection between the laptop and server is not allowed"></p>
<p>So for instance the following does not work for you:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">docker pull registry.example.com:5678/image
</span></span></code></pre></div><p>Now let&rsquo;s make a few more assumptions:</p>
<ul>
<li>You have got SSH access to a machine outside of the corporate environment.</li>
<li>You are not violating any policy by bypassing the firewall, or have permission
to do so.</li>
</ul>
<h2 id="workaround">Workaround</h2>
<p>To work around the issue, you can do the following:</p>
<ul>
<li>Connect to a machine over SSH</li>
<li>Tunnel your <code>docker pull</code> command via that SSH connection.</li>
</ul>
<p><img src="/images/allowed_connection.svg" alt="Using a proxy to tunnel your traffic"></p>
<h3 id="setting-up-the-socks5-proxy-connection">Setting up the SOCKS5 proxy connection</h3>
<p>You are going to use the <code>-D</code> option in your SSH command. This allocates a
socket listening on a port. Connections made to this port are forwarded over the
(secure) channel.</p>
<p>For example, if you can use host <code>172.31.10.5</code> as a proxy, the command would look
like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plain" data-lang="plain"><span class="line"><span class="cl">ssh -D 8080 172.31.10.5
</span></span></code></pre></div><p>Every connection to port <code>8080</code> on <code>localhost</code> is proxied via host <code>172.31.10.5</code>.</p>
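<p>Before touching Docker, you can sanity-check the tunnel with curl; the
<code>socks5h</code> scheme makes curl resolve DNS through the proxy as well (the
registry host and port are the placeholders used above):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">curl --proxy socks5h://127.0.0.1:8080 https://registry.example.com:5678/v2/
</code></pre></div>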
<h3 id="configure-docker">Configure Docker</h3>
<p>To make Docker use the proxy, you will have to configure <code>dockerd</code>. One way to do
this is to create the file <code>/etc/systemd/system/docker.service.d/proxy.conf</code>
with the following content:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="k">[Service]</span>
</span></span><span class="line"><span class="cl"><span class="na">Environment</span><span class="o">=</span><span class="s">&#34;HTTP_PROXY=socks5://127.0.0.1:8080&#34;</span>
</span></span><span class="line"><span class="cl"><span class="na">Environment</span><span class="o">=</span><span class="s">&#34;HTTPS_PROXY=socks5://127.0.0.1:8080&#34;</span>
</span></span></code></pre></div><p>(You most likely do not even need the <code>HTTP_PROXY</code> line, but it also doesn&rsquo;t
hurt. ;-) )</p>
<p>Once this file is in place, you need to restart the Docker service:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">systemctl daemon-reload
</span></span><span class="line"><span class="cl">systemctl restart docker
</span></span></code></pre></div><p>When you run the following command again, the traffic is tunneled via your proxy.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">docker pull registry.example.com:5678/image
</span></span></code></pre></div><p>Voilà, the firewall is bypassed and you can now pull your Docker image.</p>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Continuous deployment for this website using GitLab CI/CD, SSH, Docker and systemd]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2021/03/02/continuous-deployment-for-this-website-using-gitlab-ci-cd-ssh-docker-and-systemd/" type="text/html" />
    <id>https://markvanlent.dev/2021/03/02/continuous-deployment-for-this-website-using-gitlab-ci-cd-ssh-docker-and-systemd/</id>
    <author>
<name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="blog" />
    <category term="docker" />
    <category term="devops" />
    <category term="gitlab" />
    <category term="ssh" />
    <category term="systemd" />
    
    <updated>2021-11-27T22:20:54Z</updated>
    <published>2021-03-02T00:00:00Z</published>
    <content type="html"><![CDATA[<p>Almost two years ago I <a href="/2019/04/10/new-blog-backend/#the-future">wrote</a> that
ideally I would not have to log in to my VPS to update this website. Well, that
moment has finally arrived.</p>
<p>A couple of weeks ago I decided to pursue <a href="https://en.wikipedia.org/wiki/Continuous_deployment">continuous
deployment</a> for this site.
Not because it is such a hassle to deploy a new version myself and also not
because it is needed that often, but because I wanted to explore the concept.</p>
<p>To summarize the most relevant parts of what I wrote in 2019:</p>
<ul>
<li>The source code for this blog is hosted on <a href="https://gitlab.com/markvl/blog">GitLab</a>.</li>
<li>Whenever I push a commit, GitLab CI/CD builds a new Docker image.</li>
<li>Once the build is done I SSH into the VPS this site is hosted from, pull the
new image and use it to run this site.</li>
</ul>
<p>That last, manual step is now automated. The hardest part was figuring out a
method I was happy with; that is: not putting the keys to the kingdom in GitLab.
Not that I distrust GitLab, but if someone would get access to my GitLab
account, I would not want them to <em>also</em> have unlimited access to the VPS.</p>
<p>The solution I ended up with consists of three parts:</p>
<ul>
<li>Configuration on GitLab to trigger a deployment.</li>
<li>A user on the VPS so the GitLab job can log into the VPS.</li>
<li>A monitoring service on the VPS to redeploy when a trigger is detected.</li>
</ul>
<h2 id="gitlab-configuration">GitLab configuration</h2>
<p><em>(This part is basically a summary of what David Négrier wrote in his article
<a href="https://thecodingmachine.io/continuous-delivery-on-a-dedicated-server">Continuous delivery with GitLab, Docker and Traefik on a dedicated server</a>.
If you want a more detailed explanation, I can highly recommend reading his
article.)</em></p>
<p>Before we can get into the job that I added to my pipeline, we need to prepare
some things. Starting with adding a couple of
<a href="https://docs.gitlab.com/ee/ci/variables/index.html">GitLab CI/CD variables</a>:</p>
<ul>
<li><code>SSH_HOST</code> and <code>SSH_PORT</code>: the SSH client needs to know
how to connect to the VPS.</li>
<li><code>SSH_USER</code> and <code>SSH_PRIVATE_KEY</code>: the SSH client needs authentication
information.</li>
<li><code>SSH_KNOWN_HOSTS</code>: the public SSH key of the server (<code>SSH_HOST</code>) so we can add
it to the <code>known_hosts</code> file and prevent
<a href="https://en.wikipedia.org/wiki/Man-in-the-middle_attack">man-in-the-middle attacks</a>. I got this
value by running <code>ssh-keyscan &lt;hostname&gt;</code> on my laptop and pasting the output
in GitLab.</li>
</ul>
<p>In a moment we&rsquo;ll see how these variables get used.</p>
<p>Since I want to use the digest of the Docker image that is built in this
pipeline, I&rsquo;ve added an artifact to store the digest so we can access it later
on:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yml" data-lang="yml"><span class="line"><span class="cl"><span class="nt">build_image</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">script</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="l">...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">docker image ls --filter &#34;label=org.label-schema.vcs-url=https://gitlab.com/markvl/blog&#34; --filter &#34;label=org.label-schema.vcs-ref=$CI_COMMIT_SHA&#34; --format &#34;{{.Digest}}&#34; &gt; image-sha.txt</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">artifacts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">image-sha.txt</span><span class="w">
</span></span></span></code></pre></div><p>(The filters are probably not really necessary, but just in case there are
multiple images present, I want to be reasonably sure that I&rsquo;ve picked the right
one.)</p>
<p>Now finally the job that triggers the deployment:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yml" data-lang="yml"><span class="line"><span class="cl"><span class="nt">deploy_image</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">stage</span><span class="p">:</span><span class="w"> </span><span class="l">deploy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">only</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">refs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">main</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">services</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">alpine:latest</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">script</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">apk add --no-cache openssh</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">mkdir ~/.ssh</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">echo &#34;$SSH_KNOWN_HOSTS&#34; &gt;&gt; ~/.ssh/known_hosts</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">chmod 644 ~/.ssh/known_hosts</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">echo &#34;$SSH_PRIVATE_KEY&#34; &gt; ~/.ssh/private_key</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">chmod 600 ~/.ssh/private_key</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># add ssh key stored in SSH_PRIVATE_KEY variable to the agent store</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">eval $(ssh-agent -s)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">ssh-add ~/.ssh/private_key</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">ssh -p $SSH_PORT $SSH_USER@$SSH_HOST $(cat image-sha.txt)</span><span class="w">
</span></span></span></code></pre></div><p>Most of the code is just to get SSH working. All the magic happens in the last
line. Note that in contrast to David&rsquo;s article I don&rsquo;t actually execute commands
on my VPS, instead I only send one string.</p>
<p>(For the full <code>.gitlab-ci.yml</code> file see <a href="https://gitlab.com/markvl/blog/-/blob/32f876382900b6d4f25af988e7efdde3e17e4b52/.gitlab-ci.yml">the GitLab repo for this site</a>.)</p>
<p>Now every time the pipeline is run on the <code>main</code> branch, the digest of the
freshly built Docker image is sent to my VPS.</p>
<h2 id="vps-ssh-configuration">VPS SSH configuration</h2>
<p>On the VPS we need to make sure that the GitLab job can SSH into the machine.</p>
<p>The first step is to create a user to be used by GitLab (the <code>SSH_USER</code> variable
I mentioned above). Next we need to make sure that the <code>SSH_PRIVATE_KEY</code> stored
in GitLab can be used to log in. To make this possible <em>and</em> to mitigate the
risks of the SSH key in GitLab getting abused, I have added the following
content to the file <code>~/.ssh/authorized_keys</code> of the new user:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">command=&#34;/home/&lt;username&gt;/.ssh/ssh_commands.sh&#34;,no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding &lt;public key&gt; &lt;comment&gt;
</span></span></code></pre></div><p>Using the <code>command</code> option is an idea I got from
<a href="https://serverfault.com/a/803873/25920">a ServerFault answer</a>
and Mauricio Tavares&rsquo; article
<a href="https://unixwars.blogspot.com/2014/12/getting-sshoriginalcommand.html">Getting the SSH_ORIGINAL_COMMAND</a>.
In my case the <code>ssh_commands.sh</code> file stores the original command (in my case
the digest) in a file called <code>deployment.raw</code>.</p>
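<p>The script itself is not shown in this post, but a stripped-down sketch could
look like the following; the path is a placeholder and the validation is
deliberately simple. With a forced command, sshd exposes whatever the client
tried to run in <code>SSH_ORIGINAL_COMMAND</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">#!/bin/sh
# Only accept something that looks like an image digest
case "$SSH_ORIGINAL_COMMAND" in
  sha256:[0-9a-f]*)
    echo "$SSH_ORIGINAL_COMMAND" &gt; /home/deployer/deployment.raw
    ;;
  *)
    echo "rejected" &gt;&amp;2
    exit 1
    ;;
esac
</code></pre></div>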
<h2 id="vps-monitoring-service">VPS monitoring service</h2>
<p>To actually deploy the new image, we need just one more piece in this puzzle: a
script to pull and use the Docker image.</p>
<p>I&rsquo;ve opted for a systemd unit to monitor for the existence of the
<code>deployment.raw</code> file by adding the file
<code>/etc/systemd/system/blog-deployment.path</code> (note the &ldquo;<code>.path</code>&rdquo; at the end of the
filename):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="k">[Unit]</span>
</span></span><span class="line"><span class="cl"><span class="na">Description</span><span class="o">=</span><span class="s">Blog deployment path monitor</span>
</span></span><span class="line"><span class="cl"><span class="na">Wants</span><span class="o">=</span><span class="s">blog.service</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">[Path]</span>
</span></span><span class="line"><span class="cl"><span class="na">PathExists</span><span class="o">=</span><span class="s">/&lt;path&gt;/&lt;to&gt;/deployment.raw</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">[Install]</span>
</span></span><span class="line"><span class="cl"><span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span>
</span></span></code></pre></div><p>This systemd unit configuration file is accompanied by the following service
file (<code>/etc/systemd/system/blog-deployment.service</code>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="k">[Unit]</span>
</span></span><span class="line"><span class="cl"><span class="na">Description</span><span class="o">=</span><span class="s">Blog deployment service</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">[Service]</span>
</span></span><span class="line"><span class="cl"><span class="na">Type</span><span class="o">=</span><span class="s">oneshot</span>
</span></span><span class="line"><span class="cl"><span class="na">ExecStart</span><span class="o">=</span><span class="s">/usr/local/bin/deploy_blog.sh</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">[Install]</span>
</span></span><span class="line"><span class="cl"><span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span>
</span></span></code></pre></div><p>In the <code>deploy_blog.sh</code> script I do things like reading the <code>deployment.raw</code>
file, checking its content, pulling the new Docker image, verifying it and
restarting this website with the new image.</p>
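<p>For illustration, a stripped-down sketch of what such a script could look
like (the image reference, paths and container options here are made up; my
actual script does more checking than this):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">#!/bin/sh
set -eu

DIGEST_FILE=/&lt;path&gt;/&lt;to&gt;/deployment.raw

# Read the trigger file and remove it so the path unit can fire again.
DIGEST=$(cat "$DIGEST_FILE")
rm "$DIGEST_FILE"

# Refuse anything that does not look like an image digest.
case "$DIGEST" in
  sha256:*) ;;
  *) echo "Invalid digest: $DIGEST" &gt;&amp;2; exit 1 ;;
esac

# Pull the image by digest and restart the site with it.
docker pull "registry.gitlab.com/&lt;group&gt;/&lt;project&gt;@$DIGEST"
docker stop blog || true
docker rm blog || true
docker run -d --name blog -p 8080:80 "registry.gitlab.com/&lt;group&gt;/&lt;project&gt;@$DIGEST"
</code></pre></div>
<p>Both unit files only take effect after reloading systemd and enabling the
path unit:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">$ systemctl daemon-reload
$ systemctl enable --now blog-deployment.path
</code></pre></div>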
<h2 id="summary">Summary</h2>
<p>To recap my continuous deployment solution:</p>
<ul>
<li>I push a commit to the <code>main</code> branch of the repo of this site.</li>
<li>GitLab CI/CD builds a new Docker image and sends its digest to my server.</li>
<li>My server watches for the existence of the digest file and uses it as a
trigger to deploy the new version of this website.</li>
</ul>
<p>And now that I&rsquo;ve written this down, I&rsquo;m going to commit this article to Git,
push it to GitLab and then sit back and wait (im)patiently for my website to
update itself. ;-)</p>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Devopsdays Amsterdam 2018: reflection]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2018/07/04/devopsdays-amsterdam-2018-reflection/" type="text/html" />
    <id>https://markvanlent.dev/2018/07/04/devopsdays-amsterdam-2018-reflection/</id>
    <author>
      <name>map[name:Mark van Lent uri:https://markvanlent.dev/about/]</name>
    </author>
    <category term="conference" />
    <category term="devops" />
    <category term="docker" />
    <category term="elastic" />
    <category term="go" />
    <category term="kubernetes" />
    <category term="opinion" />
    
    <updated>2021-10-26T18:57:58Z</updated>
    <published>2018-07-04T00:00:00Z</published>
    <content type="html"><![CDATA[<p>About a week has past since devopsdays Amsterdam. Time to write down
some of my thoughts.</p>
<h2 id="the-conference">The conference</h2>
<p>This was the third time I went to devopsdays Amsterdam. And I
love this conference!</p>
<p>Some of the reasons:</p>
<ul>
<li>The organizers manage to get great speakers with interesting talks
on stage each year.</li>
<li><a href="https://dezwijger.nl/">Pakhuis de Zwijger</a> is a great location.</li>
<li>Excellent Wi-Fi.</li>
<li>Great atmosphere.</li>
<li>Good food.</li>
</ul>
<h2 id="the-workshops">The workshops</h2>
<h3 id="go">Go</h3>
<p>I had heard about <a href="https://golang.org/">Go</a>, some of my co-workers
have some experience with it, but I never wrote anything in the
language. I was curious about it though.</p>
<p>The <a href="/2018/06/27/devopsdays-amsterdam-2018-workshops/#go-for-ops-----michael-hausenblas-red-hat">workshop from Michael
Hausenblas</a>
was a nice intro. Based on what he told and showed us, I don&rsquo;t
expect that Go will replace Bash and Python for me. However, I
will make some time to actually write some code myself to get a better
feel for it.</p>
<h3 id="monitoring-with-elastic">Monitoring with Elastic</h3>
<p>We are already using the <a href="https://www.elastic.co/products/">Elastic Stack</a> in
some places at work, but I have not used it for monitoring purposes. (I
gravitate towards <a href="https://prometheus.io/">Prometheus</a> combined with
<a href="https://github.com/prometheus/alertmanager">Alertmanager</a> for alerting and
<a href="https://grafana.com/">Grafana</a> for dashboards with graphs.) However, <a href="/2018/06/27/devopsdays-amsterdam-2018-workshops/#monitor-your-microservices-----logs-metrics-pings-and-traces-----philipp-krenn-elastic">Philipp
Krenn showed
us</a>
that you can also do very interesting things with
<a href="https://www.elastic.co/kibana/">Kibana</a> in the monitoring and debugging realm.
Especially since you can correlate metrics with logs in the same tool.</p>
<h3 id="kubernetes">Kubernetes</h3>
<p>I could say that <a href="/2018/06/27/devopsdays-amsterdam-2018-workshops/#kubernetes-101-----bridget-kromhout-microsoft">Bridget Kromhout&rsquo;s Kubernetes
workshop</a>
was a nice refresher of what I had learned in the <a href="/2017/06/28/devopsdays-amsterdam-2017-day-zero-workshops/#introduction-to-kubernetes-----andy-repton-schuberg-philis">Kubernetes workshop
last
year</a>
but, to be honest, that would be a lie. I am glad I took this
workshop.</p>
<p>It was a good workshop with lots of hands-on tasks. But it went a bit
too fast to make it stick. I would have to spend more time on a
Kubernetes cluster to really understand everything and get fluent with
it. Luckily there is lots of information on
<a href="https://container.training/">container.training</a> (including the
sheets of this workshop) and there are plenty of cloud providers where
you can get a Kubernetes cluster without having to create or maintain
it yourself.</p>
<h2 id="the-talks">The talks</h2>
<p>The talk that resonated most with me this year was the one from <a href="/2018/06/29/devopsdays-amsterdam-2018-day-two/#that-product-team-really-brought-that-room-together-----harold-waldo-grunenwald-datadog">Waldo
Grunenwald about product
teams</a>.
Perhaps because (in my opinion) this is something that could be better
in my job. Product management, development and operations are three
different teams with different managers. Then again, I currently try
to be the &ldquo;ops guy&rdquo; in our development team so that&rsquo;s also DevOps, right? :)</p>
<p>The other most memorable talks for me were:</p>
<ul>
<li><a href="/2018/06/28/devopsdays-amsterdam-2018-day-one/#cloud-containers-kubernetes-----bridget-kromhout-microsoft">Bridget Kromhout&rsquo;s keynote: Cloud, containers, k8s </a></li>
<li><a href="/2018/06/28/devopsdays-amsterdam-2018-day-one/#service-mesh-for-microservices-----armon-dadgar-hashicorp">Armon Dadgar on service meshes</a></li>
<li><a href="/2018/06/28/devopsdays-amsterdam-2018-day-one/#going-dutch-observaties-over-nederlandse-cultuur--devops-----jason-yee-datadog">Jason Yee relating Dutch peculiarities to DevOps</a></li>
<li><a href="/2018/06/29/devopsdays-amsterdam-2018-day-two/#monitoring-the-dynamic-nature-of-cloud-computing-----lee-atchison-new-relic">Lee Atchison about monitoring in a dynamic (cloud) environment</a></li>
</ul>
<h2 id="miscellaneous">Miscellaneous</h2>
<p>I have been using <a href="https://www.gnu.org/software/emacs/">Emacs</a> for
quite a while. I was a <a href="https://www.vim.org/">Vim</a> user in the past,
but switched somewhere between 2007 and 2009. (The first time I wrote
about Emacs here was in
<a href="/2009/05/03/using-git-when-developing-plone-applications/">2009</a>.)</p>
<p>I have tried <a href="https://www.jetbrains.com/pycharm/">PyCharm</a> a couple of
times and it is a really nice editor with very useful features. It
just never stuck with me and I always went back to Emacs after a
while.</p>
<p>During the conference I used <a href="https://code.visualstudio.com/">Visual Studio Code</a>
to write my notes. And I have to say I quite liked it. I intend to also give it
a go at work. Who knows, I might even switch&hellip;</p>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Devopsdays Amsterdam 2018: workshops]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2018/06/27/devopsdays-amsterdam-2018-workshops/" type="text/html" />
    <id>https://markvanlent.dev/2018/06/27/devopsdays-amsterdam-2018-workshops/</id>
    <author>
      <name>map[name:Mark van Lent uri:https://markvanlent.dev/about/]</name>
    </author>
    <category term="conference" />
    <category term="devops" />
    <category term="docker" />
    <category term="elastic" />
    <category term="go" />
    <category term="kubernetes" />
    <category term="monitoring" />
    
    <updated>2021-10-26T18:57:58Z</updated>
    <published>2018-06-27T00:00:00Z</published>
    <content type="html"><![CDATA[<p>Just like the previous couple of years, devopsdays Amsterdam started
off with a day of workshops. This year I attended workshops about Go,
monitoring microservices and Kubernetes.</p>
<h2 id="go-for-ops--michael-hausenblas-red-hat">Go for Ops &mdash; Michael Hausenblas (Red Hat)</h2>
<p>Michael walked us through the features of the Go language by giving numerous
examples. This is a workshop that usually takes a full day so we were in for a
nice ride.</p>
<p><img src="/images/devopsdays2018_michael_hausenblas.jpg" alt="Michael Hausenblas about the language features of Go"></p>
<p>One thing he said he likes about the language is that there is (almost) no
magic involved.</p>
<p>Some things that stood out to me, Mark, (as someone who writes Python most of
the time and does not know much about Go):</p>
<ul>
<li>There are no objects in Go; instead there are &ldquo;structs&rdquo; and methods
(functions bound to a struct). (Note from Mark: Steve Francia wrote &ldquo;<a href="https://spf13.com/post/is-go-object-oriented/">Is Go an Object Oriented
language?</a>&rdquo;, which seems like a
useful article.)</li>
<li>You need to create a file first before you can write to it. (In Python you
open a file for writing and it is created if needed).</li>
<li>To format dates you have to use a special date in the formatter: Jan 2,
15:04:05, 2006 (which is basically 1, 2, 3, 4, 5, 6).</li>
<li>The standard library is very comprehensive. (This is actually something Go has
in common with Python.)</li>
<li>Code formatting is enforced via <code>gofmt</code>.</li>
</ul>
<p>Common pattern to handle errors:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="w">  </span><span class="nx">mail</span><span class="p">,</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nf">mailof</span><span class="p">(</span><span class="nx">uid</span><span class="p">,</span><span class="w"> </span><span class="nx">aproject</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">if</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="kc">nil</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="o">...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nx">os</span><span class="p">.</span><span class="nf">Exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><a href="https://go-talks.appspot.com/github.com/mhausenblas/go4ops/main.slide#27">Slide 27</a>:
if you feed the <code>printf</code> function a different type, e.g. a string, it will not
even compile. (Mark: this is something I&rsquo;m not used to, coming from Python.)</p>
<p>To expose things like functions (make them available to other packages): start
the name with an uppercase letter. Functions starting with a lowercase letter
are internal/private to the package. If you try to access an internal
function, you get a nice error message (again: at compile time).</p>
<p><a href="https://go-talks.appspot.com/github.com/mhausenblas/go4ops/main.slide#31">Slide 31</a>: &ldquo;<code>log.Fatalf()</code>&rdquo;
triggers the <code>os.Exit(1)</code> you can see when you run this example.</p>
<p>With <code>defer</code> you can schedule a call to run when the surrounding function returns (e.g. &ldquo;<code>defer f.Close()</code>&rdquo;
in <a href="https://go-talks.appspot.com/github.com/mhausenblas/go4ops/main.slide#33">slide 33</a>).
Since the Go runtime will always execute this (even if there was an error), you can
use it e.g. to clean up an open file. You can have as many <code>defer</code>s as you
like; they will be executed in reverse order.</p>
<p>Getting started with tests is quite simple: create a file named <code>&lt;module name&gt;_test.go</code>.
The name of a test function is irrelevant as long as it starts with &ldquo;<code>Test</code>&rdquo;.
Run the tests with &ldquo;<code>go test</code>&rdquo; (plus options, if you like). Go offers test
coverage information. Tip: use a nice editor/IDE and integrate running the tests
and code coverage there.</p>
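<p>For example (&ldquo;<code>./...</code>&rdquo; runs the tests of all packages below the
current directory):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">$ go test ./...
$ go test -v -cover ./...
$ go test -coverprofile=cover.out ./... &amp;&amp; go tool cover -html=cover.out
</code></pre></div>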
<p>As you can see on <a href="https://go-talks.appspot.com/github.com/mhausenblas/go4ops/main.slide#39">slide 39</a> it
is possible to add encoding tags to your struct fields to be able to, for instance,
encode and decode JSON. There are other encodings as well, see e.g.
<a href="https://pkg.go.dev/encoding#section-directories">https://pkg.go.dev/encoding#section-directories</a>.</p>
<p>Google, where Go was created, uses a monorepo. As a result they did not need
dependency management in Go. Use e.g. <a href="https://github.com/golang/dep">dep</a> to
help you out here. It looks like <a href="https://github.com/golang/go/wiki/vgo">vgo</a>
will be part of the language in the future.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>You can either trust upstream (and GitHub to be available) and not put your
dependencies in your repo, or choose not to and version control the code you
depend on yourself.</p>
<p>About running a Go application in a container: you can either pick an image with
debug tools (like <code>centos:7</code>), or pick a minimal image like <code>alpine</code> or
<code>scratch</code> as the basis of your image. You have to decide whether you want the
smallest image possible or want (some) tools included.</p>
<p>For Michael, Go replaced a lot of Bash and Python. However, Michael is not
convinced that Go is a good fit to write a complete web application in, for
instance. But decide for yourself. On <a href="https://go-talks.appspot.com/github.com/mhausenblas/go4ops/main.slide#56">slide 56</a>
there are a couple of links to some pages with criticism.</p>
<p>As already stated, Go has an extensive standard library. Michael advises to use
it. If it does not have or do what you want, your second best option is to use a
drop-in replacement. Only if that is not possible, search for a package with a
different API.</p>
<p>Useful resources:</p>
<ul>
<li>The sheets for this workshop can be found via <a href="https://go-talks.appspot.com/github.com/mhausenblas/go4ops/main.slide#1">https://go-talks.appspot.com/github.com/mhausenblas/go4ops/main.slide#1</a>.</li>
<li><a href="https://gobyexample.com/">Go by example</a> is a nice resource to learn about
the concepts in Go.</li>
<li><a href="https://goreportcard.com/">https://goreportcard.com/</a></li>
<li><a href="https://golang.org/doc/">https://golang.org/doc/</a></li>
</ul>
<h2 id="monitor-your-microservices--logs-metrics-pings-and-traces--philipp-krenn-elastic">Monitor Your Microservices &mdash; Logs, Metrics, Pings, And Traces &mdash; Philipp Krenn (Elastic)</h2>
<p>Distributed services make debugging &hellip; interesting.</p>
<p><img src="/images/devopsdays2018_philipp_krenn.jpg" alt="Philipp Krenn talking about microservices"></p>
<p>The code for this workshop, a highly monitored &ldquo;hello world&rdquo; app, can be found on
<a href="https://github.com/xeraa/microservice-monitoring">GitHub</a>.</p>
<p>The server provided for the workshop is an Amazon Lightsail instance created
with Terraform and provisioned with Ansible. (The code for this deployment is
also included in the aforementioned repo.)</p>
<p>Notable changes in Kibana 6.3:</p>
<ul>
<li>It has tools to manage the Elasticsearch indices.</li>
<li>In visualizations the aggregation previously called &ldquo;calculation&rdquo; has been
renamed to &ldquo;math.&rdquo;</li>
</ul>
<p>Packetbeat is using libpcap, just like Wireshark. Philipp thinks the future of
Packetbeat is in tracking down DNS + TLS errors since you should encrypt the
data between your services (which means that Packetbeat can no longer extract
much information from the packets).</p>
<p>Previously you used Logstash to get the Nginx access logs into Elasticsearch.
Filebeat modules can help you there. Filebeat is just forwarding the data; the
parsing is done by Elasticsearch. Filebeat has processors to enrich events with
e.g. cloud and host metadata (quite cheaply actually since this information is
collected on startup of Filebeat and cached).</p>
<p>Auditbeat has the same type of configuration as auditd.</p>
<p><a href="https://github.com/mheese/journalbeat">Journalbeat</a> (from a third party) can be
used for journald support. Philipp doesn&rsquo;t guarantee anything, but this is on
the list of the Elastic team and he hopes there will be official support for
journald.</p>
<p>You can have a rule to collect multiline messages, like stack traces, together in
one document by telling Filebeat that if a line starts with e.g. a timestamp, it
is the start of a new message and if it starts with e.g. a space it is part of a
stack trace. You could also use structured logs (which is recommended if you
can).</p>
<p>As of version 6 you can tell beats to enable (and update) the related dashboards
in Kibana.</p>
<p>For alerting with the Elastic stack you need a commercial license.</p>
<p>The machine learning (also only available in the commercial X-Pack license)
takes three iterations to detect a pattern. For example the pattern of how
much traffic your application receives on a workday can be learned in three
days. For a weekday/weekend pattern, it would need three weeks.</p>
<p>Kibana also has support for APM (Application Performance Monitoring). There are
agents for e.g. Python and Node and a bunch of others (some in beta or alpha
stage, see <a href="https://www.elastic.co/guide/en/apm/agent/index.html">the docs</a>).</p>
<p>Elastic is working on Index Lifecycle Management (ILM) which will run as part of
the cluster. Philipp is not sure when it will be available though. For now use
<a href="https://github.com/elastic/curator">Curator</a>.</p>
<p>Elasticsearch already supports metrics aggregation (called &ldquo;rollups&rdquo;) via the
API. In a future version there will also be a graphical interface to configure
this.</p>
<p>Philipp compared his workshop to Lego. He showed us some configuration,
visualizations, etcetera but &ldquo;some assembly is required.&rdquo;</p>
<h2 id="kubernetes-101--bridget-kromhout-microsoft">Kubernetes 101 &mdash; Bridget Kromhout (Microsoft)</h2>
<p><em>This was a fast paced, highly interactive workshop about Kubernetes so I only
took a few notes. However, the slides have so much information on them, you can
follow the workshop perfectly fine without comments from me.</em></p>
<p>Resources:</p>
<ul>
<li>Slides: <a href="https://devopsdaysams2018.container.training/">https://devopsdaysams2018.container.training/</a></li>
<li>Git repo: <a href="https://github.com/jpetazzo/container.training">https://github.com/jpetazzo/container.training</a></li>
</ul>
<p>Warning: we have done stuff you should not do in production. :)</p>
<p><img src="/images/devopsdays2018_bridget_kromhout.jpg" alt="Bridget Kromhout during her Kubernetes 101 workshop with lots of containers"></p>
<blockquote>
<p>Kubernetes is highly unopinionated.</p></blockquote>
<p>By default Kubernetes uses one big, flat network. However, you can configure
Kubernetes so that customers cannot access each other.</p>
<p>In real life you would not host your own Docker registry in the production
environment. We do it in the workshop because it is easier than messing with
credentials to other registries.</p>
<p>Kubernetes has extensive role based access control support.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Update 2021-10-08: According to the <code>deb</code> repository: <q>Dep was an official
experiment to implement a package manager for Go. As of 2020, Dep is deprecated
and archived in favor of Go modules, which have had official support since Go
1.11. For more details, see <a href="https://golang.org/ref/mod">https://golang.org/ref/mod</a>.</q>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[DockerCon EU 2017: day two]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2017/10/18/dockercon-eu-2017-day-two/" type="text/html" />
    <id>https://markvanlent.dev/2017/10/18/dockercon-eu-2017-day-two/</id>
    <author>
      <name>map[name:Mark van Lent uri:https://markvanlent.dev/about/]</name>
    </author>
    <category term="conference" />
    <category term="devops" />
    <category term="docker" />
    <category term="tools" />
    
    <updated>2021-10-08T19:02:07Z</updated>
    <published>2017-10-18T00:00:00Z</published>
    <content type="html"><![CDATA[<p>These are my notes of my second day at DockerCon.</p>
<p>Just as with
<a href="/2017/10/17/dockercon-eu-2017-day-one/">yesterday&rsquo;s notes</a>,
these are just notes and not summaries.</p>
<h2 id="general-session">General session</h2>
<p>The general session was mostly devoted to modernizing traditional
applications, saving costs and customer success stories.</p>
<p><img src="/images/dockerconeu17_gs2.jpg" alt="DockerCon Europe 2017 General Session day 2"></p>
<h2 id="tips-and-tricks-of-the-docker-captains--adrian-mouat-container-solutions">Tips and Tricks of the Docker Captains &mdash; Adrian Mouat (Container Solutions)</h2>
<p>Several small tips and tricks.</p>
<h3 id="daily-development">Daily development</h3>
<p>You can configure the output of the &ldquo;<code>docker ps</code>&rdquo; or &ldquo;<code>docker container ls</code>&rdquo; commands with the &ldquo;<code>--format</code>&rdquo; argument. You can also
put your preference for the formatting in your <code>~/.docker/config.json</code>
file under the <code>psFormat</code> property (see the documentation on
<a href="https://docs.docker.com/engine/reference/commandline/cli/#configuration-files">configuration files</a>). Warning:
this file also contains your passwords to Docker registries, so <strong>do not</strong>
put it online.</p>
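<p>For example, to only show the name, image and status columns:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">$ docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}'
</code></pre></div>
<p>Putting that same template string in the <code>psFormat</code> property makes it
the default.</p>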
<p>Cleaning up:</p>
<ul>
<li>Remove dangling images: <code>docker image prune</code></li>
<li>Remove stopped containers: <code>docker container prune</code></li>
<li>Remove unused volumes: <code>docker volume prune</code></li>
<li>Remove unused networks: <code>docker network prune</code></li>
<li>Remove all of the above: <code>docker system prune</code></li>
</ul>
<h3 id="building-images">Building images</h3>
<p>The &ldquo;<code>.</code>&rdquo; at the end of a Docker <code>build</code> command means that the target
(the current directory in this case) is sent to the Docker Daemon as a
tarball. Use the <code>.dockerignore</code> file to exclude large directories.</p>
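<p>For example, a <code>.dockerignore</code> file could look like this (adjust it
to whatever is actually large in your project):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext">.git
node_modules/
*.log
</code></pre></div>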
<p>Alpine is pretty small (5MB). Couple of gotchas though, like:</p>
<ul>
<li>uses <code>musl</code> instead of <code>glibc</code></li>
<li>uses its own package manager</li>
</ul>
<p>If you are looking for an alternative, have a look at the Debian Slim
images like <code>debian:stretch-slim</code>. They are (at the moment) 30MB or
smaller.</p>
<p>If you build static binaries, you can put the binary in the scratch
image. Since there is no operating system on top of the kernel, you
cannot use user names. You can use IDs instead; <code>USER 65534</code> maps to the
&ldquo;nobody&rdquo; user.</p>
<p><img src="/images/./dockerconeu17_adrian_mouat_1.jpg" alt="Adrian Mouat showing a minimal image Dockerfile"></p>
<h3 id="container-lifecycle">Container lifecycle</h3>
<p>Do not require containers to start in sequence. Instead have a
container wait for a service it depends on (including backoff) and
include this in the application itself or in a startup script.</p>
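<p>Such a startup script can be as small as this sketch (which assumes a
<code>db</code> host listening on port 5432 and netcat being available in the
image; <code>my-app</code> is a placeholder for the real service):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">#!/bin/sh
# Wait for the database with a simple exponential backoff.
delay=1
until nc -z db 5432; do
  echo "Database not ready, retrying in ${delay}s"
  sleep "$delay"
  delay=$((delay * 2))
done
exec my-app
</code></pre></div>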
<p>When Docker stops a container, it sends a <code>SIGTERM</code> signal, waits for
10 seconds and then hard kills the container with a <code>SIGKILL</code>. If the
latter happens, you cannot tidy up properly (e.g. close network
connections, write a final log entry, etc). So try to prevent this.</p>
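<p>In a shell-based entrypoint, handling <code>SIGTERM</code> could look
something like this sketch (<code>my-service</code> is a placeholder):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">#!/bin/sh
cleanup() {
  echo "Caught SIGTERM, shutting down cleanly"
  # Close connections, write a final log entry, etc.
  exit 0
}
trap cleanup TERM

# Run the real service in the background so the shell keeps
# receiving signals, then wait for it.
my-service &amp;
wait $!
</code></pre></div>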
<p><a href="https://github.com/krallin/tini">Tini</a>, used for signal forwarding,
is integrated in Docker now.</p>
<p>A benefit of healthchecks is that Swarm will only route to healthy
containers. Note that healthchecks are run <em>inside</em> the container
itself, not on the host. This might mean you will have to install more
software in your image (e.g. <code>curl</code>).</p>
<h3 id="security">Security</h3>
<p>To improve security, use a read-only file system by adding
<code>--read-only</code> to the <code>run</code> command. Use a
<a href="https://docs.docker.com/storage/tmpfs/">tmpfs mount</a> to
create writeable locations where applications can write e.g. pid
files. The data written to the tmpfs mounts is kept in memory and not
stored persistently on the host.</p>
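<p>For example, to run Nginx with a read-only root file system (the tmpfs
locations depend on the image you use):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">$ docker run -d --read-only \
    --tmpfs /var/run --tmpfs /var/cache/nginx \
    nginx
</code></pre></div>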
<p><img src="/images/./dockerconeu17_adrian_mouat_2.jpg" alt="Adrian Mouat showing how to start a read-only Nginx container"></p>
<p>Users are not namespaced (by default). If an attacker breaks out of
the container via a service running as root, the attacker is also root
on the host. So do not run as root! Create and set a <code>USER</code> in your
<code>Dockerfile</code> or use the <code>nobody</code> user.</p>
<p>To prevent using <code>sudo</code>, use <a href="https://github.com/tianon/gosu">gosu</a>
instead.</p>
<p>It&rsquo;s nearly always a bad idea to run Docker in Docker (issues with
file systems, caching and image stores). Instead, mount the Docker
socket with &ldquo;<code>-v /var/run/docker.sock:/var/run/docker.sock</code>&rdquo;. Be
aware: this is a security problem because there is less isolation
between the container and the host.</p>
<h2 id="alpine-linux-under-the-microscope--natanael-copa-docker">Alpine Linux under the microscope &mdash; Natanael Copa (Docker)</h2>
<p><a href="https://alpinelinux.org/">Alpine Linux</a> uses the MIT licensed musl
libc which has a clean, modern codebase and is lightweight. It&rsquo;s
small, so what is missing?</p>
<ul>
<li>Some GNU extensions</li>
<li>Lots of localization data</li>
<li>GNU bloat</li>
<li>Name Service Switch (NSS)</li>
<li>Network Services Library (libnsl)</li>
<li>80+ CVEs ;-)</li>
</ul>
<figure><img src="/images/dockerconeu17_natanael_copa.jpg"
    alt="Natanael Copa comparing the sizes of CentOS, Ubuntu and Alpine Linux Docker images"><figcaption>
      <p>Natanael Copa comparing the sizes of CentOS, Ubuntu and Alpine Linux Docker images</p>
    </figcaption>
</figure>

<p>Busybox is also part of Alpine Linux. It includes a POSIX shell and most
of the standard utilities. It&rsquo;s pretty impressive how many tools are
squeezed into ~800KB.</p>
<p>Alpine created apk-tools because the traditional package managers were
not fast enough. It is faster than other package managers because it
is designed to read once and write once (compared to a minimum of 3 reads
and 2 writes for traditional package managers).</p>
<p>The <code>--no-cache</code> option was added to the package manager specifically
for Docker. It does not store cache information on disk. If you use
this flag, you do not need a cleanup step (in contrast to when you are
using <code>apt</code>).</p>
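<p>The difference in a <code>Dockerfile</code> looks something like this (the
package is just an example):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"># Alpine: no cache on disk, so no cleanup step needed
apk add --no-cache nginx

# Debian/Ubuntu: clean the apt lists yourself to keep the layer small
apt-get update &amp;&amp; apt-get install -y nginx &amp;&amp; rm -rf /var/lib/apt/lists/*
</code></pre></div>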
<p>With regards to security:</p>
<ul>
<li>Alpine uses secure defaults</li>
<li>Has a smaller attack surface</li>
<li>Uses more secure components (musl, libressl)</li>
<li>Has a hardened kernel (unofficial fork of grsecurity)</li>
</ul>
<p>When not to use Alpine? If you:</p>
<ul>
<li>Depend on precompiled (closed source) binaries</li>
<li>Need good localization</li>
<li>Want commercial support</li>
<li>Need glibc/GNU specific behaviour</li>
</ul>
<h2 id="practical-design-patterns-in-docker-networking--dan-finneran-docker">Practical design patterns in Docker networking &mdash; Dan Finneran (Docker)</h2>
<p>Several types of network drivers:</p>
<ul>
<li>
<p><strong>Null</strong>: you can use this to black hole your container.</p>
</li>
<li>
<p><strong>Host</strong>: the simplest, comes out of the box (use
<code>--net=host</code>). The container will connect its ports to the host.</p>
</li>
<li>
<p><strong>Bridge</strong>: no flags needed (the default); containers connect to the
internal bridge network. Containers can speak with each other, but
nothing outside can reach them unless you expose ports.</p>
<p>Using the <code>-p</code> flag you can expose ports. Only expose services
that need to be exposed.</p>
</li>
<li>
<p>Swarm <strong>overlay</strong> networking: using VXLAN to create overlay network over
the underlying network. The network is encrypted by default.</p>
</li>
</ul>
<p><img src="/images/dockerconeu17_dan_finneran.jpg" alt="Dan Finneran"></p>
<p>A relatively new addition is the <strong>macvlan</strong> driver. It provides a
hardware address to each container. You&rsquo;ll want this if you need to
connect to a VLAN network or have to deal with IPAM. It requires
promiscuous mode.</p>
<blockquote>
<p>The macvlan driver essentially makes a Docker container a first
class citizen on the network.</p></blockquote>
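<p>Creating a macvlan network looks something like this (the subnet, gateway
and parent interface obviously depend on your environment):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">$ docker network create -d macvlan \
    --subnet=192.168.1.0/24 --gateway=192.168.1.1 \
    -o parent=eth0 macvlan-net
$ docker run --rm -it --network macvlan-net alpine ip addr
</code></pre></div>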
<p>You can have a separate data and control plane in your network on
hosts with multiple NICs. This provides physical and logical
separation of traffic.</p>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[DockerCon EU 2017: day one]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2017/10/17/dockercon-eu-2017-day-one/" type="text/html" />
    <id>https://markvanlent.dev/2017/10/17/dockercon-eu-2017-day-one/</id>
    <author>
      <name>map[name:Mark van Lent uri:https://markvanlent.dev/about/]</name>
    </author>
    <category term="conference" />
    <category term="devops" />
    <category term="docker" />
    <category term="security" />
    <category term="tools" />
    
    <updated>2021-10-08T19:02:07Z</updated>
    <published>2017-10-17T00:00:00Z</published>
    <content type="html"><![CDATA[<p>These are my notes of my first day at DockerCon.</p>
<p>Where I usually try to make summaries of conference talks I attend,
I&rsquo;ve only made notes for this conference. Most, if not all, talks have
been recorded as far as I know, so you should be able to watch them to
place the notes into context.</p>
<h2 id="general-session">General session</h2>
<p>In the context of &ldquo;MTA&rdquo; (modernize traditional applications) we saw a
demo of how easy it is to convert legacy applications with the Docker
application converter (dac). In the demo they showed how the tool can
generate a <code>Dockerfile</code> from a tarball of a Java application.</p>
<p>Not everyone is using Swarm. Unfortunately, using a different
orchestration tool means that you do not have the seamless integration
that Swarm has. However, the next version of Docker will have native
integration for both Swarm and Kubernetes.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p><img src="/images/dockerconeu17_native_kubernetes.jpg" alt="Kubernetes will have native support in Docker"></p>
<p>This way you can still use the same tools and workflow while being
able to choose between Swarm and Kubernetes. It also means you can
still use your <code>docker-compose.yml</code> files.</p>
<p><img src="/images/dockerconeu17_kubernetes_win_mac.jpg" alt="Kubernetes supported in Docker CE for Windows and Mac"></p>
<h2 id="hacked-run-time-security-for-containers--gianluca-borello-sysdig">Hacked! Run time security for containers &mdash; Gianluca Borello (Sysdig)</h2>
<p>A container is ephemeral and you don&rsquo;t need to protect it. You want
to protect the application/service instead.</p>
<p>SELinux is very powerful, but difficult to master. And thus people
often turn it off. To avoid users turning off or circumventing
security measures, one should reduce friction.</p>
<p>Forensics on containers is difficult, for instance because containers
may only live for a few minutes.</p>
<p>To build a security framework, you need to:</p>
<ul>
<li>observe, e.g. using Sysdig</li>
<li>understand the services, e.g. using Sysdig ServiceVision</li>
<li>detect bad behaviour, e.g. using <a href="https://sysdig.com/opensource/falco/">Sysdig Falco</a></li>
</ul>
<p><img src="/images/dockerconeu17_gianluca_borello.jpg" alt="Gianluca explaining what you need to build a security framework"></p>
<p>Gianluca demonstrated that Falco can detect e.g. network activity
performed by known binaries that are not supposed to send or receive
data over the network.</p>
<p>Sysdig maintains a ruleset for Falco. They update it weekly.  They are
also experimenting with automatic ruleset creation based on normal
behaviour of a container.</p>
<p><a href="https://sysdig.com/products/secure/">Sysdig Secure</a>: A new product to
provide run-time security for containers. You can define policies when
to trigger an event. You can also store Sysdig captures so that you
can see what happened <em>before</em> the event triggered.</p>
<p><a href="https://sysdig.com/blog/sysdig-inspect/">Sysdig Inspect</a> is a UI
around open source Sysdig tool to analyse Sysdig captures.</p>
<p>Sysdig has a low overhead (different from strace). It is meant to run
24/7 on production systems.</p>
<h2 id="docker-but-im-a-sysadmin--mike-coleman-docker">Docker?!?! But I&rsquo;m a SysAdmin &mdash; Mike Coleman (Docker)</h2>
<p>There was a contest to create the smallest Docker container that
printed out &ldquo;Hello world&rdquo;. The result was an image of only 64
kilobytes.</p>
<p>Security is about more than just isolation. For instance: where did
the images come from, are they up to date, how do you deal with
sensitive data and/or passwords?</p>
<p>Docker for AWS and Docker for Azure have integration for those platforms. They
offer more than installing standard Docker on an EC2 instance.</p>
<p><img src="/images/dockerconeu17_mike_coleman.jpg" alt="Mike Coleman about some of the considerations you have to make"></p>
<p>When you start integrating Docker into your environment, don&rsquo;t forget
to change your processes. Think about your backup and recovery
strategies, et cetera.</p>
<p>Mike encourages us to share our knowledge, and our missteps.</p>
<h2 id="creating-effective-images--abby-fuller-aws">Creating effective images &mdash; Abby Fuller (AWS)</h2>
<p>This is a talk about disk space.</p>
<p>Smaller images mean faster builds and deploys, but also a smaller
attack surface.</p>
<p>Some tips:</p>
<ul>
<li>Use shared base image where possible.</li>
<li>Limit the data written to the container layer.</li>
<li>Chain <code>RUN</code> statements.</li>
<li>Prevent cache misses at build time for as long as possible.</li>
</ul>
<p>To prevent cache misses, put all static stuff first (labels, etc). You
really want to make sure that you use the cache as much as possible.</p>
<p>Read the documentation about the <a href="https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#leverage-build-cache">build cache</a>.</p>
<p>Pick your base images wisely. You can go for a minimal OS, but there
are also reasons for wanting a full base OS:</p>
<ul>
<li>Security (with a minimal image, you have to roll your own)</li>
<li>Compliance</li>
<li>Ease of deployment (a minimal image can mean extra work)</li>
</ul>
<p>(Note to self: have a look at the
<a href="https://docs.docker.com/engine/reference/builder/#onbuild">ONBUILD</a>
command.)</p>
<p>Some random notes:</p>
<ul>
<li>MSI installations (for Windows images) are not space efficient.</li>
<li>Coming up soon: run Linux containers &ldquo;as-is&rdquo; on Windows.</li>
<li>Switching users also adds another layer.</li>
<li>Where possible use two images: one to build an artefact, one to build an image.</li>
<li><code>scratch</code> is a special, empty Docker image. Use this to build your own base images.</li>
<li>Use a <a href="https://docs.docker.com/engine/reference/builder/#dockerignore-file">.dockerignore</a> file.</li>
</ul>
<p><img src="/images/dockerconeu17_abby_fuller.jpg" alt="Abby Fuller with tips for creating Docker images"></p>
<p>Cleaning up:</p>
<ul>
<li><code>docker image prune -a</code>: Remove all unused images (not just dangling ones)</li>
<li><code>docker system prune -a</code>: Also remove stopped containers, unused networks, etc.</li>
<li>Make sure your orchestration platform is garbage collecting.</li>
</ul>
<p>Recap:</p>
<ul>
<li>Less layers is more</li>
<li>Choose or build your base wisely</li>
<li>Not all languages should build the same</li>
<li>Keep it simple, avoid extras (e.g. use <code>apt install --no-install-recommends</code>)</li>
<li>Tools are here to help</li>
</ul>
<h2 id="docker-500-going-fast-while-protecting-data--diogo-mónica-docker">Docker 500: Going fast while protecting data &ndash; Diogo Mónica (Docker)</h2>
<p>Security means <q>a state of being free from danger or threat.</q> But as
soon as we connect something to a network, we are exposing it to
threats. So it would be better to talk about &ldquo;safety&rdquo; since that is
about <q>being protected from or unlikely to cause danger, risk or
injury.</q></p>
<p><img src="/images/dockerconeu17_diogo_monica.jpg" alt="Diogo Mónica about going as fast as possible, safely"></p>
<p>The &ldquo;Docker 500&rdquo; from the title is a play on the <a href="https://en.wikipedia.org/wiki/Indianapolis_500">Indy 500</a>.</p>
<p>While there can be horrible accidents in the Indy 500, the drivers
often can just walk away from their car. We must architect our systems
to protect our data, just like the drivers are protected in the Indy
500.</p>
<p>In racing, two categories of measures are taken: before the crash and
after/during the crash.</p>
<p>If we translate the pre-crash measures to our world, we&rsquo;d get
something like this:</p>
<ul>
<li>Test: create a trusted, repeatable and <strong>adversarial</strong> CI/CD
pipeline. Unless you have adversarial tests, you are effectively
testing in production.</li>
<li>Design applications to segment portions of their infrastructure
(microsegmentation).</li>
<li>Practice worst case scenarios. For example: if a secret (key) is
leaked, can you revoke trust of that secret in under X time?</li>
<li>Reverse uptime! Have a maximum uptime for a server. Note that this
is also good for operations because the machines don&rsquo;t get a
chance to drift.</li>
</ul>
<p>Measures after/during a crash:</p>
<ul>
<li>Freeze the container (disk and memory state), solve the problem
and inspect the frozen container later.</li>
<li>Automatic scaling.</li>
<li>Sandbox by default (by putting it in Docker). Make sure that data
cannot be used outside your data center through so-called
&ldquo;crypto-anchors&rdquo; (a term coined by Diogo), e.g. via a hardware
security module.</li>
<li>Role based access control + visibility</li>
<li>Have data about the crash. This is where the Docker security
ecosystem shines.</li>
</ul>
<p>Access to the data should not just be secure, but also safe. With careful
engineering you can be fast and safe at the same time.</p>
<p>Note that containerd can freeze memory and push to a registry. For
freezing the disk you can use <code>commit</code> and then push to a registry.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Update 2017-10-19: I now understand that Docker CE will only
include Kubernetes in Docker for Windows and Docker for Mac because
those platforms use LinuxKit. Docker (the company) is not clear on
how/when Kubernetes will be supported on Linux.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[DevOpsDays Amsterdam 2017: day zero (workshops)]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2017/06/28/devopsdays-amsterdam-2017-day-zero-workshops/" type="text/html" />
    <id>https://markvanlent.dev/2017/06/28/devopsdays-amsterdam-2017-day-zero-workshops/</id>
    <author>
      <name>map[name:Mark van Lent uri:https://markvanlent.dev/about/]</name>
    </author>
    <category term="ansible" />
    <category term="conference" />
    <category term="devops" />
    <category term="docker" />
    <category term="kubernetes" />
    <category term="openshift" />
    <category term="tools" />
    
    <updated>2021-10-07T18:13:55Z</updated>
    <published>2017-06-28T00:00:00Z</published>
    <content type="html"><![CDATA[<p>Before the regular DevOpsDays kicked off, there was a day filled with workshops.</p>
<h2 id="before-we-got-started">Before we got started</h2>
<p>While I was on my way to Amsterdam, I was reading up on my RSS feeds
and ran across the most recent comic on
<a href="https://turnoff.us/">turnoff.us</a>. It was so appropriate that I decided
to copy it here:</p>
<figure><img src="/images/turnoff_us_devops_explained.png"
    alt="DevOps Explained"><figcaption>
      <p>DevOps is not a Role &mdash; taken from <a href="https://turnoff.us/geek/devops-explained/">turnoff.us</a> and scaled down a bit. License: <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">CC BY-NC-SA 4.0</a></p>
    </figcaption>
</figure>

<h2 id="setup-your-own-ansibledocker-workshopraising-an-ansible-army--arnab-sinha-tata-consultancy-services">Setup your own Ansible/Docker Workshop/Raising an Ansible Army &mdash; Arnab Sinha (TATA Consultancy Services)</h2>
<p>Arnab wanted to be able to easily create lab environments for
trainings. This workshop not only discusses how the lab is set up but
also uses such a lab environment (in this case to provide an Ansible
training environment).</p>
<p>The setup of the lab he used today: each participant got
a control node and two managed nodes. Each node was in fact a Docker
container which was managed by Ansible.</p>
<p>The first part of the workshop was basically an introduction to
Ansible with topics like the history of Ansible and basic command line
usage. Arnab demonstrated how to use a custom inventory file, limiting
plays to a group or certain tasks (or skipping tasks) and how to syntax
check your playbook.</p>
<p>A few examples:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ ansible all -i <span class="s2">&#34;localhost,&#34;</span> -c <span class="nb">local</span> -m shell -a whoami
</span></span><span class="line"><span class="cl">$ ansible -i demo.ini all -m shell -a whoami -v
</span></span><span class="line"><span class="cl">$ ansible-playbook playbook.yml --syntax-check
</span></span></code></pre></div><p>Some best practices:</p>
<ul>
<li>Use the <code>.ini</code> extension for your inventory file.</li>
<li>Use a separate inventory file for each environment (develop, test,
production, etc).</li>
<li>Use tags so you can specify which tasks you want to run; see the
example after this list. (Use &ldquo;<code>ansible-playbook --list-tags playbook.yml</code>&rdquo; to show all available
tags.)</li>
</ul>
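<p>For example, to only run the tasks tagged &ldquo;configuration&rdquo;, or to skip
them (the tag name is made up):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">$ ansible-playbook playbook.yml --tags configuration
$ ansible-playbook playbook.yml --skip-tags configuration
</code></pre></div>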
<p>In the category &ldquo;today I learned&rdquo;:</p>
<ul>
<li>Ansible has a pull mode (<code>ansible-pull</code>). Who knew? :-)</li>
<li>Ansible comes with documentation: <code>ansible-doc</code>.</li>
<li>Looping over sequences with <code>with_sequence</code> (see
<a href="https://docs.ansible.com/ansible/latest/user_guide/playbooks_loops.html#with-sequence">the docs</a>).</li>
<li>You can make a playbook executable by adding
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/usr/bin/env ansible-playbook
</span></span></span></code></pre></div>at the top (and using <code>chmod</code>).</li>
</ul>
<p>If you want to run your own lab, you can use Arnab&rsquo;s GitHub repo:
<a href="https://github.com/arnabsinha4u/ansible-traininglab">arnabsinha4u/ansible-traininglab</a>. Note
that this assumes a CentOS host.</p>
<p>In order to be able to log in to the &ldquo;master&rdquo; node (via <code>ssh ansiblelabuser1@localhost</code>) I had to enable <code>PasswordAuthentication</code>
in <code>/etc/ssh/sshd_config</code>. But since I had already run the Ansible
playbook, which marks that file immutable, I was not allowed to change
it. I first had to run this command:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ chattr -i /etc/ssh/sshd_config
</span></span></code></pre></div><p>Other GitHub repos from Arnab that you can use:</p>
<ul>
<li><a href="https://github.com/arnabsinha4u/docker-traininglab">arnabsinha4u/docker-traininglab</a></li>
<li><a href="https://github.com/arnabsinha4u/launchpad">arnabsinha4u/launchpad</a></li>
</ul>
<h2 id="introduction-to-kubernetes--andy-repton-schuberg-philis">Introduction To Kubernetes &mdash; Andy Repton (Schuberg Philis)</h2>
<p>Kubernetes is a container orchestration platform. It has a huge open
source backing and new features are being built quickly. It does one
thing (in an elegant way).</p>
<p>Kubernetes has three main components:</p>
<ul>
<li>Masters: the brains of the <em>cluster</em>. Consists of: Apiserver,
controller manager, scheduler.</li>
<li>Nodes: the brains of individual <em>nodes</em>. Consists of: kubelet,
kube proxy.</li>
<li>etcd: replicated key/value store; the state store and clustering
manager of kubernetes.</li>
</ul>
<p>When you look at it from a &lsquo;physical&rsquo; perspective, you have a
Kubernetes node and this node runs Docker, which in turn runs the
containers. Pods are a logical wrapper around containers; we don&rsquo;t care
about nodes.</p>
<p>Pods are mortal. What this means is that processes are expected to
die. But we do not care because Kubernetes ensures availability by
making sure that there are enough of them running.</p>
<p>During the workshop we used the following GitHub repo:
<a href="https://github.com/Seth-Karlo/intro-to-kubernetes-workshop">Seth-Karlo/intro-to-kubernetes-workshop</a>.</p>
<p>The pod you can create with the <code>pod/pod.yml</code> file can be used for a
toolbox to examine other pods.</p>
<p>More terminology: a <em>replica set</em> is basically a way of saying &ldquo;make
sure there are N copies of a pod.&rdquo; If you look at the specification of
a replica set, you can see that it contains a Pod spec.</p>
<p>Using the <code>readinessProbe</code> directive you can make sure that a
container does not receive traffic until it is actually ready. Note
that this is different from Docker&rsquo;s health check which is meant to
determine if a container is still working or should be killed.</p>
<p>With the replica set example in aforementioned repo, Kubernetes will
automatically start a pod again if it is killed. Even if you kill a
pod yourself&mdash;Kubernetes doesn&rsquo;t care <em>why</em> it has gone down.</p>
<p>If you edit a replica set (e.g. to update to a newer version of an image),
it has no immediate impact because the pod spec is nested. Deployments,
however, can enforce that such changes are actually rolled out.</p>
<p>To get the whole configuration of a pod, including the default and not
just the stuff we specified, run:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ kubectl get pod &lt;podname&gt; -o yaml
</span></span></code></pre></div><p>Note that <code>volumeMounts</code> appear by default on every pod you create.</p>
<p>Secrets, although the name implies something different, are <strong>not</strong>
encrypted; all pods in the same namespace can access the secret and
decode it (base64). It is an easy way to put information in a pod, but it
is not secure!</p>
<p>Services don&rsquo;t &ldquo;exist&rdquo; like containers do. A service is a purely
logical idea. A service exposes pods to other pods.</p>
<p>A service automatically gets a DNS entry: <code>&lt;service name&gt;.&lt;namespace name&gt;</code>. This means that from inside your containers, you can use DNS
to access other containers.</p>
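<p>For example, assuming a service called <code>wordpress</code> in the
<code>default</code> namespace, any pod in the cluster can reach it with:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">$ curl http://wordpress.default
</code></pre></div>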
<figure><img src="/images/devopsdays2017_kubernetes_workshop.jpg"
    alt="Andy presenting"><figcaption>
      <p>Andy with his fresh WordPress installation</p>
    </figcaption>
</figure>

<p>About scheduling:</p>
<ul>
<li>You can label nodes and then make sure that pods are scheduled on
nodes with a certain label.</li>
<li>Kubernetes will distribute pods across nodes as &rsquo;evenly&rsquo; as
possible.</li>
<li>Kubernetes will not auto reschedule pods when you add a new node.</li>
</ul>
<p>For this workshop we used <a href="https://github.com/kubernetes/kops">kops</a>
because it was easier.  At Schuberg Philis they actually use Terraform
to manage their cluster(s). Note that you can use a flag and then
<code>kops</code> will spit out Terraform code.</p>
<blockquote>
<p>If you are worried about your pods going down gracefully,
you are doing your pods wrong.</p></blockquote>
<p>If your application depends on long running processes: don&rsquo;t use
Kubernetes. Use the right tool for the right application.</p>
<p>Combine containers inside pods if latency matters, if they need to
share configuration files or if they need to connect via the loopback
device.</p>
<p>Miscellaneous:</p>
<ul>
<li><a href="https://kompose.io/">Kompose</a>: a tool to convert Docker Compose
files to Kubernetes YAML files.</li>
<li>With
<a href="https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/">horizontal pod autoscaling</a>
you can automatically scale up/down the number of pods to handle load.</li>
<li>You can set limits on your pods so Kubernetes will kill it off
when it goes over the limits, e.g when it uses too much memory.</li>
</ul>
<p>Resources:</p>
<ul>
<li><a href="https://slides.com/andyrepton/introduction-to-kubernetes">Slides</a></li>
<li><a href="https://gist.github.com/Seth-Karlo/f6a88ca2e79dec42094abbacc850df5c">Command cheat sheet</a></li>
</ul>
<h2 id="hands-on-openshift-developer-workshop-in-azure--alessandro-vozza-microsoft--samuel-terburg-red-hat">Hands-On OpenShift Developer Workshop (In Azure) &mdash; Alessandro Vozza (Microsoft) &amp; Samuel Terburg (Red Hat)</h2>
<p>Why OpenShift: because developers need a platform to be able to deploy
their applications. OpenShift is a platform to run your containers at
scale. Meant for enterprise: not necessarily the latest features, but
focus on stability.</p>
<p>OpenShift was originally written in Ruby, but it has been rewritten in
Go and it is built upon Kubernetes. OpenShift is always one release
(circa three months) behind Kubernetes.</p>
<p>Everything you can deploy in Kubernetes, you can deploy on OpenShift.</p>
<figure><img src="/images/devopsdays2017_openshift_workshop.jpg"
    alt="Alessandro presenting"><figcaption>
      <p>Alessandro explaining what OpenShift is</p>
    </figcaption>
</figure>

<p><a href="https://github.com/openshift/origin">OpenShift Origin</a> is community
supported. If you want a commercially supported version, you have to
run on Red Hat Enterprise Linux
(RHEL). <a href="https://www.redhat.com/en/technologies/cloud-computing/openshift">Red Hat OpenShift</a> uses RHEL
images, where OpenShift Origin uses CentOS.</p>
<p>OpenShift Online runs on AWS, but you can for instance also run it on
bare metal if you want. But public clouds are a more natural fit for
cloud-native applications.</p>
<p>Pods are the orchestrated units in OpenShift. Containers in a pod can
talk to each other via localhost and local sockets. The security
boundary is extended from the container to the pod. Containers can see
each other&rsquo;s processes and files. You only want to run one process in a
container though.</p>
<p>A service can be seen as a sort of load balancer to redirect traffic to
the right pods. Internally it is using <code>iptables</code>.</p>
<p>OpenShift provides its own Docker registry which you can use if you
want to.</p>
<p>OpenShift solved the persistent storage problem before Kubernetes
did. You can use the native storage for your solution (e.g. EBS on
AWS). Note that block storage solutions require mounting/unmounting
and thus take a little longer.</p>
<p>As with Kubernetes, there is no built-in autoscale for OpenShift.
<a href="https://access.redhat.com/products/red-hat-cloudforms">Red Hat CloudForms</a>
can monitor your cluster and do the scaling for you.</p>
<p>The routing layer is your entrypoint into the cluster. It&rsquo;s based on
HAProxy. Comparable with Kubernetes&rsquo; Ingress.</p>
<p>RHEL Atomic is a minimalistic OS designed to run Docker
containers. (It is similar to CoreOS, but Red Hat wanted to have its
own OS.) Everything you want to run has to run in a container. You can
install OpenShift on RHEL Atomic.</p>
<p>Fun fact: you can create resources in Azure with Ansible.</p>
<p>Unfortunately there were some problems with the Red Hat OpenShift
Azure Test Drive. As an alternative I used
<a href="https://docs.okd.io/3.11/minishift/index.html">minishift</a> to
run OpenShift on my laptop. With it, I could work on the workshop.</p>
<p>Further reading:</p>
<ul>
<li><a href="https://kubebyexample.com/">Kube by Example</a></li>
<li><a href="https://aka.ms/openshift">Azure Red Hat OpenShift</a></li>
<li><a href="https://github.com/ivanthelad/ansible-azure">ansible-azure repository</a></li>
<li><a href="https://docs.openshift.com/">OpenShift Documentation</a></li>
<li><a href="https://github.com/minishift/minishift">Minishift repository</a></li>
</ul>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Setting up a temporary HTTP/HTTPS proxy via SSH]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2013/09/19/setting-up-a-temporary-http-https-proxy-via-ssh/" type="text/html" />
    <id>https://markvanlent.dev/2013/09/19/setting-up-a-temporary-http-https-proxy-via-ssh/</id>
    <author>
      <name>map[name:Mark van Lent uri:https://markvanlent.dev/about/]</name>
    </author>
    <category term="development" />
    <category term="devops" />
    <category term="docker" />
    <category term="proxy" />
    <category term="ssh" />
    
    <updated>2022-05-10T20:18:22Z</updated>
    <published>2013-09-19T06:32:00Z</published>
    <content type="html"><![CDATA[<p>Currently I&rsquo;m working on a project where I have the staging
environment running on a virtual machine in a vlan. However, the
virtual machine cannot directly access the internet for security
reasons. This is inconvenient when I want to e.g. run a
<a href="https://www.buildout.org/en/latest/">buildout</a> to update the project.</p>
<p>A colleague told me to use
<a href="https://acme.com/software/micro_proxy/"><code>micro_proxy</code></a> and
<a href="https://acme.com/software/micro_inetd/"><code>micro_inetd</code></a> to proxy
traffic via my laptop. This is a description of how you can set things up.</p>
<div class="note update">
  <div class="note_header">
    Update (2019-07-15)<span class="hidden">:</span>
  </div>
  <div class="note_body">
I am currently using a Docker container to run a proxy on my
laptop. I have added a <a href="#docker">Docker</a> section where I describe my new
setup.
  </div>
</div>

<h2 id="ad-hoc">Ad hoc</h2>
<p>Obviously the first step is to install the relevant packages on the
local machine (Ubuntu in my case):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ sudo apt-get install micro-proxy micro-inetd
</span></span></code></pre></div><p>The next step is to run the proxy (again: on my laptop) and make sure
it accepts connections on port <code>3128</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ micro-inetd <span class="m">3128</span> /usr/sbin/micro_proxy
</span></span></code></pre></div><p>Then, when you SSH into the remote machine you will have to forward
the right port:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ ssh box.example.com -R 3128:localhost:3128
</span></span></code></pre></div><p>Whenever you want to access the internet, you&rsquo;ll have to use the proxy
listening on port <code>3128</code>. For instance, to run <code>wget</code> and <code>buildout</code>,
you can set the following environment variables:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ <span class="nb">export</span> <span class="nv">http_proxy</span><span class="o">=</span>http://localhost:3128
</span></span><span class="line"><span class="cl">$ <span class="nb">export</span> <span class="nv">https_proxy</span><span class="o">=</span>http://localhost:3128
</span></span></code></pre></div><p>(Note that I&rsquo;m also proxying HTTPS traffic here, which is supported by
<code>micro_proxy</code>.)</p>
<p>The following <code>wget</code> command should now succeed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ wget http://www.google.com/
</span></span></code></pre></div><h2 id="repeatable">Repeatable</h2>
<p>Assuming the ad hoc setup works, you may want to set things up so they
are a little bit easier to use the next time. This is what I did.</p>
<p>So I don&rsquo;t have to remember how to start the proxy, I added this line
to the <code>~/.bashrc</code> file on my local machine:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">alias</span> <span class="nv">start_proxy</span><span class="o">=</span><span class="s1">&#39;echo Running proxy on port 3128 &amp;&amp; micro-inetd 3128 /usr/sbin/micro_proxy&#39;</span>
</span></span></code></pre></div><p>The SSH command is also too much typing for my liking. So I added this
to my <code>~/.ssh/config</code> file:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="na">Host box</span>
</span></span><span class="line"><span class="cl">    <span class="na">HostName box.example.com</span>
</span></span><span class="line"><span class="cl">    <span class="na">RemoteForward 3128 localhost:3128</span>
</span></span></code></pre></div><p>To make sure that the HTTP(S) proxy is used on the remote machine, I
added this to my <code>~/.bashrc</code> file on the remote:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">export</span> <span class="nv">http_proxy</span><span class="o">=</span>http://localhost:3128
</span></span><span class="line"><span class="cl"><span class="nb">export</span> <span class="nv">https_proxy</span><span class="o">=</span>http://localhost:3128
</span></span></code></pre></div><h2 id="end-result">End result</h2>
<p>So whenever I want to work on the staging environment, I open a
terminal and run:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ start_proxy
</span></span></code></pre></div><p>In another terminal I type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ ssh box
</span></span></code></pre></div><p>And I&rsquo;m good to go.</p>
<p>Now, there may be better solutions (especially if you want to
permanently set up a proxy), but for my purposes this works great.</p>
<h2 id="docker">Docker</h2>
<div class="note update">
  <div class="note_header">
    Update (2019-07-15)<span class="hidden">:</span>
  </div>
  <div class="note_body">
    I&rsquo;ve added this section to document an alternative to the
<code>micro_inetd</code>/<code>micro_proxy</code> combination.
  </div>
</div>

<p>When I originally wrote this article, I was not yet (or only just) using Docker.
But when I was setting up a new laptop a while ago, I wanted to run a proxy in a
Docker container.</p>
<p>As a result, I now run the following to start a proxy:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ docker run --name squid -d -p 3128:3128 datadog/squid
</span></span></code></pre></div><p>This way I don&rsquo;t have to install <code>micro_proxy</code> and <code>micro_inetd</code> on my machine.</p>
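<p>And because the container has a fixed name, it can be stopped and
restarted with the standard Docker commands instead of being recreated
each time:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"># Stop the proxy when done
</span></span><span class="line"><span class="cl">$ docker stop squid
</span></span><span class="line"><span class="cl"># Restart the existing container the next time
</span></span><span class="line"><span class="cl">$ docker start squid
</span></span></code></pre></div>]]></content>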
  </entry>
</feed>
