<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/_stylesheets/atom_stylesheet.xsl"?>
<entry
        xmlns="http://www.w3.org/2005/Atom"
        xmlns:pktz="https://pktz.fr/schema/"
>
    <title>matrix-appservice-irc leaks messages through replies</title>
    <id>https://pktz.fr/matrix/security/2024-appservice-irc-message-leak-through-replies/</id>
    <published>2024-11-23</published>
    <updated>2024-11-23</updated>
    <author>
        <name>Val Lorentz</name>
        <uri>https://valentin-lorentz.fr/</uri>
    </author>
    <pktz:keywords>
        Matrix, matrix-appservice-irc, node-irc, NVT#1550329, CVE-2023-38700, GHSA-c7hh-3v6c-fj4q, CVE-2024-32000, GHSA-wm4w-7h2q-3pf7, CVE-2024-39691, GHSA-w9mh-5x8j-9754
    </pktz:keywords>
    <content type="xhtml" xml:lang="en">
        <div xmlns="http://www.w3.org/1999/xhtml">
            <pktz:toc />

            <section>
            <h2 id="summary">Summary</h2>

<p>
Several bugs in <a href="https://github.com/matrix-org/matrix-appservice-irc/">matrix-appservice-irc</a> allowed malicious users to leak parts of past messages in a room they joined after the message was sent.
</p>

            </section>

            <section>
            <h2 id="background">Background</h2>

<p>
By default, on IRC and in Matrix rooms acting as a <a href="https://matrix.org/docs/older/types-of-bridging/">"portal"</a> to IRC,
users are only allowed to see messages sent while they were in the channel/room (known as <code>"m.room.history_visibility": "joined"</code> on Matrix).
</p>

<p>
Matrix allows message events to reference the event id of another message that it is a reply to.
The reply may contain the text of the replied-to message (known as the "reply fallback"), but this text is ignored, and is <a href="https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3676-transitioning-away-from-reply-fallbacks.md">being deprecated</a>.
</p>

<p>
Because the IRC protocol traditionally does not support replies, this Matrix bridge often formats Matrix replies like this on the IRC side:
</p>

<pre>
<![CDATA[
  <alice> hello everyone, happy to be here
  [more messages]
  <bob> <alice> "hello everyone, happy..." <- welcome in this room
]]>
</pre>
            </section>

            <section>
            <h2 id="threat-model">Threat model</h2>

<p>
The <code>hello everyone, happy...</code> part in this reply is susceptible to leaks, as it repeats a past message to the current audience of the IRC channel,
which may be larger than the original audience.
</p>

<p>
Security-wise, I do not consider it an issue that parts of Alice's message are repeated when Bob, who could read Alice's message, replies to it.
I only see it as an issue when Mallory, who cannot see the content of Alice's message, can craft an event that causes Alice's message to be repeated.
</p>

<p>
All of the vulnerabilities mentioned here require the attacker to know the event id of the message they want to leak.
While event ids are not necessarily easy to obtain, they are considered public, as they can be part of <a href="https://spec.matrix.org/v1.11/appendices/#matrixto-navigation">matrix.to</a> or <a href="https://spec.matrix.org/v1.11/appendices/#matrix-uri-scheme">matrix:</a> URLs shared publicly, and are shared with any homeserver joining a room, even if said homeserver does not have access to the event's content.
</p>

<p>
These vulnerabilities only allow leaking a small part at the beginning of the message, up to 32 characters.
</p>
            </section>

            <section>
            <h2 id="bugs">The bugs</h2>
                <section>
                <h3 id="bug1">Bug #1: From other rooms</h3>

<p>
This first bug happened because matrix-appservice-irc did not perform any check on the event being replied to before including it in the reply.
This allowed an attacker who knew the event ID to simply send a reply to it in a room they controlled, and the replied-to message would be sent to the IRC channel.
</p>

<p>
Additionally, this allows leaking the nickname of the author of the replied-to message.
In the Matrix model, this is only an issue if the attacker is not in the room, as any server in the room can tell the author (and timestamp) of an event from its id, even if it does not have access to the content of the event.
</p>

<p>
<a href="https://github.com/matrix-org/matrix-appservice-irc/commit/8bbd2b69a16cbcbeffdd9b5c973fd89d61498d75">The fix</a> was obvious: check that the replied-to message is in the same room as the reply.
</p>
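
<p>
As a minimal sketch of that check (using a hypothetical, simplified event shape and function name, not the bridge's actual code), the idea is roughly:
</p>

<pre>
<![CDATA[
// Hypothetical, simplified event shape; the bridge's real types differ.
interface MatrixEvent {
    eventId: string;
    roomId: string;
    sender: string;
    body: string;
}

// Returns the text that may be quoted on IRC, or null when the
// replied-to event belongs to a different room than the reply.
function quotableText(reply: MatrixEvent, repliedTo: MatrixEvent): string | null {
    if (repliedTo.roomId !== reply.roomId) {
        return null;
    }
    return repliedTo.body;
}

const aliceMessage: MatrixEvent = {
    eventId: "$alice1",
    roomId: "!private:example.org",
    sender: "@alice:example.org",
    body: "hello everyone, happy to be here",
};

// Mallory replies from a room she controls, referencing Alice's event id.
const malloryReply: MatrixEvent = {
    eventId: "$mallory1",
    roomId: "!bridged:evil.example",
    sender: "@mallory:evil.example",
    body: "interesting",
};

// Bob replies from the room the message was sent in.
const bobReply: MatrixEvent = {
    eventId: "$bob1",
    roomId: "!private:example.org",
    sender: "@bob:example.org",
    body: "welcome in this room",
};

const crossRoomQuote = quotableText(malloryReply, aliceMessage); // null
const sameRoomQuote = quotableText(bobReply, aliceMessage); // Alice's text
]]>
</pre>

<p>
With this check, Mallory's cross-room reply is bridged without any quote, while Bob's reply can still quote Alice.
</p>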

<h4>Timeline:</h4>

<dl>
    <dt><time>2023-07-12</time></dt>
    <dd>Reported to Matrix.org's security team, and assigned ticket number NVT#1550329</dd>
    <dt><time>2023-07-20</time></dt>
    <dd>Given the lack of human reply within 24 hours (unexpected given
    <a href="https://web.archive.org/web/20230704143436/https://matrix.org/security-disclosure-policy/">Matrix.org's security policy</a>), I asked for an acknowledgement</dd>
    <dt><time>2023-07-21</time></dt>
    <dd>Report acknowledged by Matrix.org's security team</dd>
    <dt><time>2023-07-31</time></dt>
    <dd><a href="https://github.com/matrix-org/matrix-appservice-irc/commit/8bbd2b69a16cbcbeffdd9b5c973fd89d61498d75">Patch published</a> and <a href="https://github.com/matrix-org/matrix-appservice-irc/releases/tag/1.0.1">v1.0.1 released with the patch</a></dd>
    <dt><time>2023-08-04</time></dt>
    <dd>Vulnerability published as <a href="https://github.com/matrix-org/matrix-appservice-irc/security/advisories/GHSA-c7hh-3v6c-fj4q">GHSA-c7hh-3v6c-fj4q</a> and CVE-2023-38700</dd>
</dl>

                </section>

                <section>
                <h3 id="bug2">Bug #2: From the same room</h3>


<p>
This bug comes from the fix to the previous bug being partial:
it prevents crafting events to leak messages from a different room,
but not leaking messages from the same room.
</p>

<p>
This means that someone can use it to access the same first 32 characters of any past event of a room,
even if the room's history_visibility is not "world_readable".
This time, however, the attacker must currently be in the room.
</p>

<p>
Shortly after I sent the report, Matrix.org's security team triaged it as a low priority issue, as 1. they considered event ids to be hard to find, and 2. only recent messages could be leaked this way.
<br />
However, a malfunction in Matrix.org's ticketing system caused this information not to be sent to me,
adding three months of delay before I could answer that 1. event ids are easy to obtain for someone running their own homeserver and 2. matrix-appservice-irc allows replies to messages indefinitely far back in the past.
</p>

<p>
This vulnerability was then fixed by checking that the last join event of the reply's sender predates the message being replied to.
</p>
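
<p>
A minimal sketch of such a check follows. The types and function names are hypothetical and only illustrate the idea; they are not taken from the actual patch.
</p>

<pre>
<![CDATA[
// Hypothetical, simplified membership history of one user in one room.
interface MembershipEvent {
    membership: "join" | "leave";
    timestamp: number; // milliseconds since the Unix epoch
}

// Timestamp of the most recent join, or null if the user never joined.
function lastJoinTs(history: MembershipEvent[]): number | null {
    let last: number | null = null;
    for (const ev of history) {
        if (ev.membership === "join" && (last === null || ev.timestamp > last)) {
            last = ev.timestamp;
        }
    }
    return last;
}

// The sender may quote a message only if their last join predates it.
function mayQuote(history: MembershipEvent[], repliedToTs: number): boolean {
    const joinTs = lastJoinTs(history);
    return joinTs !== null && joinTs <= repliedToTs;
}

// Mallory joined, left, and rejoined later.
const malloryHistory: MembershipEvent[] = [
    { membership: "join", timestamp: 100 },
    { membership: "leave", timestamp: 200 },
    { membership: "join", timestamp: 400 },
];

const sentWhileAway = mayQuote(malloryHistory, 300); // false
const sentAfterRejoin = mayQuote(malloryHistory, 500); // true
]]>
</pre>

<p>
Note that comparing against the last join only is conservative: in this sketch, a message sent at time 150, which Mallory did see, is refused too.
</p>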

<h4>Timeline:</h4>

<dl>
    <dt><time>2023-11-28</time></dt>
    <dd>Reported to Matrix.org's security team</dd>
    <dt><time>2023-11-29</time></dt>
    <dd>Report acknowledged by Matrix.org's security team</dd>
    <dt><time>2024-02-29</time></dt>
    <dd>Given the lack of fixes within 90 days (unexpected given
    <a href="https://web.archive.org/web/20240226165023/https://matrix.org/security-disclosure-policy/">Matrix.org's security policy</a>), I asked for news</dd>
    <dt><time>2024-03-01</time></dt>
    <dd>Matrix.org's security team informs me their ticketing system lost a reply they meant to send on <time>2023-11-30</time>, and explains why they did not consider this issue a priority</dd>
    <dt><time>2024-03-01</time></dt>
    <dd>I replied with details</dd>
    <dt><time>2024-03-12</time></dt>
    <dd>Matrix.org's security team acknowledges the higher priority and notifies me they are working on a fix</dd>
    <dt><time>2024-03-20</time></dt>
    <dd><a href="https://github.com/matrix-org/matrix-appservice-irc/pull/1799">Pull request titled "Add tests for various forms of rich replies"</a> opened</dd>
    <dt><time>2024-03-26</time></dt>
    <dd><a href="https://github.com/matrix-org/matrix-appservice-irc/pull/1799/commits/ffcba6f2866f35577424931d5949a743e2fcf497">Commit titled "Ensure leaving the channel and rejoining doesn't allow you to quote-reply messages you haven't seen"</a> added to the pull request, containing the patch for the vulnerability</dd>
    <dt><time>2024-03-27</time></dt>
    <dd>Pull request merged as <a href="https://github.com/matrix-org/matrix-appservice-irc/commit/4af7d3009f10b1f2fb810784c1e491d9d3bee82b">a single commit titled "Add tests for various forms of rich replies"</a></dd>
    <dt><time>2024-04-08</time></dt>
    <dd><a href="">v2.0.0 released with the patch</a></dd>
    <dt><time>2024-04-11</time></dt>
    <dd>Vulnerability published as <a href="https://github.com/matrix-org/matrix-appservice-irc/security/advisories/GHSA-wm4w-7h2q-3pf7">GHSA-wm4w-7h2q-3pf7</a> and CVE-2024-32000</dd>
</dl>
                </section>

                <section>
                <h3 id="bug3">Bug #3: Timestamps are controlled by the sender</h3>

<p>
The fix to the previous vulnerability was missing an important detail:
in Matrix, all event timestamps (aka. <code>origin_server_ts</code>) are provided by the sender of the event.
Timestamps can be, somewhat intentionally, far back in the past,
as Matrix's data model is eventually-consistent, allowing events to form
a <abbr title="Directed Acyclic Graph">DAG</abbr>,
with some branches being potentially pretty old.
</p>

<p>
This allowed attackers to fake the timestamp of their join event,
so they could again leak any message they liked afterward, as the check
<code>if (senderJoinTs > cachedEvent.timestamp) {</code> in the above patch would always pass.
<br/>
Unlike the two bugs above, this requires the attacker to use a malicious homeserver,
which means private federations are mostly not affected.
</p>
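
<p>
To make this concrete, here is a small sketch built around the comparison quoted above. The surrounding types and values are hypothetical, and I assume here that the branch guarded by this comparison is the one that refuses to quote.
</p>

<pre>
<![CDATA[
// Hypothetical, simplified shapes; the real bridge types differ.
interface JoinEvent {
    sender: string;
    origin_server_ts: number; // chosen by the sender's homeserver
}

interface CachedMessage {
    timestamp: number;
}

// Same comparison as in the patch: the quote is refused only when the
// join looks more recent than the message (assumption, see above).
function quoteRefused(senderJoinTs: number, cachedEvent: CachedMessage): boolean {
    return senderJoinTs > cachedEvent.timestamp;
}

// Alice's message was sent in January 2024.
const aliceMessage: CachedMessage = { timestamp: Date.UTC(2024, 0, 10) };

// Mallory actually joins in March 2024...
const honestJoin: JoinEvent = {
    sender: "@mallory:evil.example",
    origin_server_ts: Date.UTC(2024, 2, 1),
};

// ...but her own homeserver can claim that she joined in 2020.
const forgedJoin: JoinEvent = {
    sender: "@mallory:evil.example",
    origin_server_ts: Date.UTC(2020, 0, 1),
};

const refusedWhenHonest = quoteRefused(honestJoin.origin_server_ts, aliceMessage); // true
const refusedWhenForged = quoteRefused(forgedJoin.origin_server_ts, aliceMessage); // false
]]>
</pre>

<p>
With the forged timestamp the comparison never triggers, so Alice's message is quoted to Mallory.
</p>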

<p>
This is pretty tricky to fix, with three options:
</p>

<ol>
    <li>making the homeserver reject incoming join events with an <code>origin_server_ts</code> older than a few minutes (which breaks eventual consistency)</li>
    <li>making the bridge use the time its homeserver received the join event (assuming this is even possible with the <a href="https://spec.matrix.org/v1.11/application-service-api/">Application Service API</a>, which matrix-appservice-irc relies on)</li>
    <li>like option 2, but making the bridge itself keep track of when it received the join event (which looked like it would be full of hard edge cases)</li>
</ol>

<p>
matrix-appservice-irc developers proceeded with option 3.
</p>


<h4>Timeline:</h4>

<dl>
    <dt><time>2024-04-11</time></dt>
    <dd>Reported to Matrix.org's security team</dd>
    <dt><time>2024-04-11</time></dt>
    <dd>Report acknowledged by Matrix.org's security team</dd>
    <dt><time>2024-05-21</time></dt>
    <dd><a href="https://github.com/matrix-org/matrix-appservice-irc/pull/1804">Patch published</a></dd>
    <dt><time>2024-06-18</time></dt>
    <dd>I notice the above fix being available publicly, and notify some bridge and IRC network operators</dd>
    <dt><time>2024-07-04</time></dt>
    <dd><a href="https://github.com/matrix-org/matrix-appservice-irc/releases/tag/2.0.1">v2.0.1 released with the patch</a>, vulnerability published as <a href="https://github.com/matrix-org/matrix-appservice-irc/security/advisories/GHSA-w9mh-5x8j-9754">GHSA-w9mh-5x8j-9754</a> and CVE-2024-39691</dd>
</dl>
                </section>

                <section>
                <h3 id="bug4">Bug #4: The bridge did not keep track of join event timestamps long enough</h3>

<p>
Unfortunately, the previous fix was still incomplete, as it <a href="https://github.com/matrix-org/matrix-appservice-irc/blob/bd06139191141ace39df5df6cfa975022d6b3ca9/src/bridge/MatrixHandler.ts#L150-L153">only keeps track of the 8192 most recent joins</a>, and then <a href="https://github.com/matrix-org/matrix-appservice-irc/blob/bd06139191141ace39df5df6cfa975022d6b3ca9/src/bridge/MatrixHandler.ts#L1355">falls back to assuming the user joined when the bridge started</a>.
Having a limit on the cache size makes sense to avoid memory growing unbounded, and falling back to the bridge start instead of the attacker-provided date avoids leaking very old messages.
</p>
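
<p>
The following sketch illustrates this behaviour. The class, its names and the tiny limit are hypothetical, and evicting the oldest entry first is an assumption on my part, not necessarily what the bridge does.
</p>

<pre>
<![CDATA[
// Bounded cache of the time at which the bridge saw each join.
class JoinTimeCache {
    private joins = new Map<string, number>();
    private maxEntries: number;
    private bridgeStartTs: number;

    constructor(maxEntries: number, bridgeStartTs: number) {
        this.maxEntries = maxEntries;
        this.bridgeStartTs = bridgeStartTs;
    }

    recordJoin(roomId: string, userId: string, receivedTs: number): void {
        const key = `${roomId} ${userId}`;
        this.joins.delete(key); // re-insert so this entry becomes the newest
        this.joins.set(key, receivedTs);
        if (this.joins.size > this.maxEntries) {
            // A Map iterates in insertion order, so the first key is the oldest.
            const oldest = this.joins.keys().next().value;
            if (oldest !== undefined) {
                this.joins.delete(oldest);
            }
        }
    }

    // Falls back to the bridge start time when the join was evicted.
    joinTime(roomId: string, userId: string): number {
        return this.joins.get(`${roomId} ${userId}`) ?? this.bridgeStartTs;
    }
}

// The bridge started at time 1000 and remembers only two joins.
const cache = new JoinTimeCache(2, 1000);
cache.recordJoin("!secret:example.org", "@mallory:evil.example", 5000);
cache.recordJoin("!a:example.org", "@u1:example.org", 5001);
cache.recordJoin("!b:example.org", "@u2:example.org", 5002);

// Mallory's entry was evicted: she is now assumed to have joined at 1000.
const assumedJoin = cache.joinTime("!secret:example.org", "@mallory:evil.example");
]]>
</pre>

<p>
Once enough other joins have pushed Mallory's entry out, every message sent between the bridge start (1000) and her real join (5000) passes the check from the previous fix.
</p>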

<p>
However, this is not enough in practice. For example, the public OFTC bridge has about 20k users on the Matrix side, which means far more than 20k joins to keep track of, as users are often joined to more than one room.
(The 20k figure is not official, but it coincides with the number of users OFTC loses when the bridge restarts, which is <a href="https://netsplit.de/networks/top10.php?year=2024">visible on netsplit.de</a>, at least until July 2024, when they stopped auto-connecting idle users.)
And as the OFTC bridge restarts only about once a month (again, this can be seen on netsplit.de), this means that most messages on OFTC can be leaked by someone joining a channel within a few weeks.
</p>

<p>
After I told this to Matrix.org's security team, they decided to reconsider the solution to the previous bug, and pick a variant of option 2 above, which is to make the bridge rely on the homeserver to tell it which events a user is allowed to see.
This makes sense as the homeserver already computes this information; it just does not make it available to the bridge.
This takes the form of <a href="https://github.com/matrix-org/matrix-spec-proposals/pull/4185"><acronym title="Matrix Spec Change">MSC</acronym>4185: Event Visibility API</a>, which is currently not implemented.
</p>

<h4>Timeline:</h4>

<dl>
    <dt><time>2024-07-07</time></dt>
    <dd>Reported to Matrix.org's security team</dd>
    <dt><time>2024-07-09</time></dt>
    <dd>Report acknowledged by Matrix.org's security team</dd>
    <dt><time>2024-08-29</time></dt>
    <dd> <a href="https://github.com/matrix-org/matrix-spec-proposals/pull/4185"><acronym title="Matrix Spec Change">MSC</acronym>4185: Event Visibility API</a> was opened, though I was not notified at the time</dd>
    <dt><time>2024-11-07</time></dt>
    <dd>After 120 days with no news, I told Matrix.org's security team I would publish this post in about two weeks</dd>
    <dt><time>2024-11-12</time></dt>
    <dd>Matrix.org's security team apologized for the lack of news, and told me they cannot fix it right away because they depend on MSC4185.</dd>
</dl>

                </section>

            </section>
        </div>
    </content>
</entry>
