<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Untitled Publication]]></title><description><![CDATA[Hey! I'm a Senior Software Engineer at JumpCloud, on the path to becoming a Principal Software Engineer!
I have worked in the IT industry for over 7 years now, ]]></description><link>https://blog.rodrigocaballero.net</link><generator>RSS for Node</generator><lastBuildDate>Fri, 24 Apr 2026 14:14:12 GMT</lastBuildDate><atom:link href="https://blog.rodrigocaballero.net/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Building a Reproducible Data Pipeline: Orchestrated Data Gathering and Model Training for Multi-Resident HAR]]></title><description><![CDATA[HAR Series — Part 4 · ~12 min read


A quick note: If you noticed a bigger gap than usual between posts, my apologies! Things have been hectic on personal and professional fronts — in the best way pos]]></description><link>https://blog.rodrigocaballero.net/building-a-reproducible-data-pipeline-orchestrated-data-gathering-and-model-training-for-multi-resident-har</link><guid isPermaLink="true">https://blog.rodrigocaballero.net/building-a-reproducible-data-pipeline-orchestrated-data-gathering-and-model-training-for-multi-resident-har</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[iot]]></category><category><![CDATA[Python]]></category><category><![CDATA[pytorch]]></category><category><![CDATA[Data Science]]></category><category><![CDATA[data-engineering]]></category><dc:creator><![CDATA[Rodrigo Caballero]]></dc:creator><pubDate>Mon, 09 Mar 2026 18:00:12 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/62558c9fb32ddd968bbebbde/a9ea9d18-f304-4497-95af-dc747b3104cc.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>HAR Series — Part 4 · ~12 min read</em></p>
<hr />
<blockquote>
<p><strong>A quick note:</strong> If you noticed a bigger gap than usual between posts, my apologies! Things have been hectic on personal and professional fronts — in the best way possible. I recently took on a <strong>Lead Software Engineer</strong> role at Capital One and presented this work at the University for my dissertation. Both experiences served as the ultimate stress test for the ideas explored in this series. Now, back to it.</p>
</blockquote>
<p>One of the most underestimated challenges in Human Activity Recognition (HAR) research is not model design — it's <strong>data acquisition at scale</strong>. For multimodal, multi-resident systems, collecting <em>consistent</em>, <em>well-labeled</em>, and <em>privacy-aware</em> data quickly becomes the dominant bottleneck.</p>
<p>In this phase of the project, I focused on designing and validating a <strong>data-gathering orchestrator mode</strong>: a system that allows researchers to boot the device, label an activity, and begin collecting synchronized multimodal data with minimal friction. This post walks through how the orchestrator was built, how data were collected over two days, and the key design decisions underlying model training.</p>
<hr />
<h2>01 — Why an Orchestrator-Centric Design?</h2>
<p>Early prototyping revealed a familiar failure mode in HAR research: data-collection logic scattered across sensor services, ad hoc scripts, and manual synchronization steps. The result is a fragile pipeline that is difficult to reproduce, extend, or hand off to another researcher.</p>
<p>To address this, I designed the <strong>Edge Orchestrator</strong> as the single source of truth for everything that matters at collection time:</p>
<ul>
<li><p>Sensor coordination and stream management</p>
</li>
<li><p>Activity labeling and buffering</p>
</li>
<li><p>Privacy-preserving preprocessing</p>
</li>
<li><p>Cloud upload and structured dataset generation</p>
</li>
</ul>
<p>The key inversion of control: <strong>sensors stream continuously, and the orchestrator decides when and how to capture data</strong>. This enables two clean execution modes — Data Gathering Mode and Predictor (Inference) Mode. This entry focuses on the former.</p>
<hr />
<h2>02 — What Is the Orchestrator?</h2>
<p>The Orchestrator is the backbone of the HAR system, coordinating data flow, preprocessing, and activity recognition across all tiers. At startup, it initiates a <strong>gRPC server</strong> that acts as the unified endpoint for all sensor services — receiving Protocol Buffer payloads and transforming them into Python objects for downstream processing. The public repository of the orchestrator can be found at the following URL: <a href="https://github.com/RodCaba/fp-orchestrator">https://github.com/RodCaba/fp-orchestrator</a></p>
<p>Researchers interact with the system through a dedicated UI built with HTML, JS, and CSS, exposed via <strong>FastAPI</strong> over HTTP. A WebSocket endpoint enables real-time updates, allowing the interface to display:</p>
<ul>
<li><p>Connection status of each sensor service</p>
</li>
<li><p>Currently identified inhabitants via RFID</p>
</li>
<li><p>Data batches processed in gathering mode, or the active prediction label in predictor mode</p>
</li>
</ul>
<img src="https://cdn.hashnode.com/uploads/covers/62558c9fb32ddd968bbebbde/19a54256-15d7-4889-acc2-875251e763fb.png" alt="" style="display:block;margin:0 auto" />

<p>Fig. 1 — The Orchestrator UI: sensor connection status, RFID presence, and real-time stream monitoring.</p>
<hr />
<h2>03 — Data Gathering Mode: From Label to Dataset</h2>
<p>From the researcher's perspective, the data-gathering flow is intentionally simple. The complexity lives in the system, not in the workflow:</p>
<ol>
<li><p><strong>Select an activity label</strong> from the orchestrator UI.</p>
</li>
<li><p><strong>Swipe at least one RFID tag</strong> to declare occupant presence.</p>
</li>
<li><p>Press the start activity button.</p>
</li>
<li><p><strong>Perform the activity naturally</strong> — no scripted behaviors required.</p>
</li>
<li><p>Let the system handle synchronization, preprocessing, and upload.</p>
</li>
</ol>
<img src="https://cdn.hashnode.com/uploads/covers/62558c9fb32ddd968bbebbde/5d572c63-9a5b-46bb-9c4a-4b67890f09c6.png" alt="" style="display:block;margin:0 auto" />

<p>Fig. 2 — Data flow across the edge, orchestration, and cloud tiers during a gathering session.</p>
<h3>RFID as the Collection Trigger</h3>
<p>RFID presence acts as the implicit trigger for data collection. When at least one unique tag is detected, and the start button is pressed, the orchestrator activates audio and IMU streams, begins buffering synchronized multimodal data, and records the number of detected users as a first-class feature in every sample.</p>
<p>This design choice ensures that collected data always corresponds to <em>actual occupancy</em> and the researcher's intentional decision that an activity has started or ended — not an automated timer or environmental trigger that could introduce label noise.</p>
<h3>Temporal Buffering and Cloud Upload</h3>
<p>Incoming data is streamed via gRPC into the orchestrator, where it is immediately preprocessed, anonymized (no raw audio is ever stored), and held in a <strong>temporal edge buffer</strong>. Once the buffer reaches <strong>10,000 records</strong>, the orchestrator:</p>
<ul>
<li><p>Serializes the batch into a structured JSON file</p>
</li>
<li><p>Uploads it to AWS S3</p>
</li>
<li><p>Clears the local buffer and resumes collection</p>
</li>
</ul>
<p>This batching strategy provides network efficiency, fault tolerance, and clean dataset segmentation for training. If a cloud upload fails, the system retains the batch locally to prevent data loss.</p>
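<p>As a concrete illustration, the batch-and-retain logic described above can be sketched as follows. This is a simplified model, not the orchestrator's actual code: the class name <code>EdgeBatchBuffer</code> and the injected <code>upload_fn</code> callback are hypothetical stand-ins for the real JSON serialization and S3 upload.</p>
<pre><code class="language-python">import json


class EdgeBatchBuffer:
    """Buffer records until a batch size is reached, then hand the
    serialized batch to an upload callback. Batches whose upload
    fails are retained locally so no data is lost."""

    def __init__(self, upload_fn, batch_size=10_000):
        self.upload_fn = upload_fn      # e.g. an S3 put_object wrapper
        self.batch_size = batch_size
        self.buffer = []                # records awaiting serialization
        self.pending = []               # serialized batches that failed to upload

    def add(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        payload = json.dumps(self.buffer)
        self.buffer = []                # clear immediately; collection resumes
        try:
            self.upload_fn(payload)
        except Exception:
            self.pending.append(payload)  # retain locally for a later retry
</code></pre>
<p>Decoupling the buffer from the uploader also makes the failure path easy to test without a network connection.</p>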
<blockquote>
<p><strong>Design Candor:</strong> During the initial S3 uploads, some IMU readings were incorrectly labeled due to a configuration error. Because the system stores files as blobs, corrections cannot be made in place — the mislabeled data must be accounted for in the data-loading pipeline at training time. This is a known limitation and a reminder that error correction in blob storage requires pipeline-level handling, not in-place edits.</p>
</blockquote>
<hr />
<h2>04 — Two Days of Real-World Data Collection</h2>
<p>Using this orchestrated flow, I deployed the system in a <strong>three-person household kitchen</strong> over two days of naturalistic activity. The result:</p>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>Labeled data files</td>
<td>610</td>
</tr>
<tr>
<td>Records per file</td>
<td>~10,000</td>
</tr>
<tr>
<td>Total multimodal records</td>
<td>6M+</td>
</tr>
<tr>
<td>Processed, anonymized data</td>
<td>2.7 GB</td>
</tr>
</tbody></table>
<img src="https://cdn.hashnode.com/uploads/covers/62558c9fb32ddd968bbebbde/f3161f52-dbc3-4efb-b350-e1cf2ee49d66.png" alt="" style="display:block;margin:0 auto" />

<p>Fig. 3 — Data in the Cloud Tier.</p>
<p>Crucially, this data was collected <strong>without modifying the environment</strong> or forcing scripted behaviors. Activities emerged naturally, including concurrent and collaborative actions — exactly the scenarios that challenge multi-resident HAR systems.</p>
<p>Making realistic data collection easy enough to be repeated and extended by other researchers was one of the core goals of the orchestrator. This dataset validates that it works.</p>
<hr />
<h2>05 — "Boot and Collect": Researcher Experience</h2>
<p>From a usability standpoint, the system was designed to minimize setup overhead. A researcher can boot the edge device, connect the mobile IMU stream, open the orchestrator UI, and start collecting labeled data within minutes. In practice, the time from system startup to the availability of labeled data in S3 averaged roughly <strong>two minutes</strong>. This experience can be viewed in the following video:</p>
<p><a class="embed-card" href="https://youtu.be/XC-YxUs6ezs">https://youtu.be/XC-YxUs6ezs</a></p>
<p>This metric matters more than it might seem:</p>

<ul>
<li><p>Dataset growth becomes incremental instead of painful</p>
</li>
<li><p>New activity labels can be added without pipeline changes</p>
</li>
<li><p>Model retraining becomes an expected, repeatable step — not a risky undertaking</p>
</li>
</ul>
<p>The orchestrator effectively promotes data collection from an afterthought into a <strong>first-class system capability</strong>.</p>
<hr />
<h2>06 — Model Training: Design Decisions and Trade-offs</h2>
<p>In the previous entry, I introduced the <code>fp-orchestrator-utils</code> package, which provides a CLI for downloading and uploading proto definitions to S3, along with a wrapper for S3 operations via boto3. This package was extended to incorporate data loading, inference, and training logic. The package repository can be found at the following URL: <a href="https://github.com/RodCaba/fp-orchestrator-utils">https://github.com/RodCaba/fp-orchestrator-utils</a></p>
<h3>The DataLoader</h3>
<p>The <code>DataLoader</code> class establishes a secure S3 connection via environment variables and offers two primary modes: downloading all JSON data from an S3 bucket (with optional local caching), or loading from a local directory. The class processes data into feature and label arrays, performing cleaning, transformation, and class label encoding using <code>sklearn.LabelEncoder</code>.</p>
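<p>The label-encoding step works roughly as in the minimal example below (the activity names are placeholders, not the dataset's actual label set):</p>
<pre><code class="language-python">from sklearn.preprocessing import LabelEncoder

labels = ["cooking", "washing_dishes", "cooking", "idle"]
encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

# classes_ is sorted alphabetically; encoded holds stable integer ids
# encoder.classes_ -> ['cooking', 'idle', 'washing_dishes']
# encoded          -> [0, 2, 0, 1]
</code></pre>
<p>Keeping the fitted encoder around matters: <code>inverse_transform</code> is what maps model predictions back to human-readable activity names at inference time.</p>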
<h3>Model Architecture: Why Not One Big Network?</h3>
<p>Rather than collapsing all inputs into a monolithic architecture, I opted for <strong>modular processors per modality</strong>. Each sensor type gets its own dedicated processor; features are then fused through a shared attention mechanism. The rationale:</p>
<ul>
<li><p>IMU sensors have varying dimensionalities (orientation has 7 dimensions; others have 3)</p>
</li>
<li><p>Missing modalities can be handled gracefully by filling with zero tensors</p>
</li>
<li><p>Edge inference constraints favor modular, quantizable components</p>
</li>
<li><p>Independent processors allow targeted fine-tuning without retraining the full network</p>
</li>
</ul>
<p><strong>IMU Sensor Processor</strong></p>
<p>Each IMU sensor (accelerometer, gyroscope, total acceleration, gravity, orientation) has its own processor instance. The pipeline: linear projection → LSTM over the time sequence → average pooling over the time dimension → final projection with dropout.</p>
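<p>A minimal PyTorch sketch of one such processor is shown below. Layer sizes are illustrative, not the project's actual hyperparameters:</p>
<pre><code class="language-python">import torch
import torch.nn as nn


class IMUProcessor(nn.Module):
    """Linear projection -> LSTM -> average pooling over time -> head."""

    def __init__(self, input_dim, hidden=64, out=32, dropout=0.3):
        super().__init__()
        self.project = nn.Linear(input_dim, hidden)   # per-timestep projection
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Dropout(dropout), nn.Linear(hidden, out))

    def forward(self, x):
        # x: (batch, time, input_dim) -- 3 axes, or 7 for orientation
        h, _ = self.lstm(self.project(x))   # h: (batch, time, hidden)
        pooled = h.mean(dim=1)              # average over the time dimension
        return self.head(pooled)


# One processor instance per IMU sensor; only the input width differs
accel = IMUProcessor(input_dim=3)
orientation = IMUProcessor(input_dim=7)
</code></pre>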
<p><strong>Audio Processor</strong></p>
<p>Input is a tensor of mel spectrograms. The pipeline: projection from mel bands to a fixed size → two LSTM layers (first processes within-segment time, second processes across segments) → average pooling over both dimensions → final projection with dropout.</p>
<h3>Attention as a Fusion Strategy</h3>
<p>Not all modalities are equally informative for all activities. The attention mechanism — inspired by Nakabayashi and Saito (2024) — allows the model to <em>learn</em> which sensors matter most in a given context: IMU-heavy signals during motion-intensive activities, audio-dominant cues during appliance-based actions. This proved especially valuable in collaborative and overlapping activity scenarios.</p>
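<p>A minimal form of such attention-based fusion, shown purely as a sketch (the class name <code>FeatureAttention</code> and the dimensions are illustrative), scores each modality's feature vector and reweights it before fusion:</p>
<pre><code class="language-python">import torch
import torch.nn as nn


class FeatureAttention(nn.Module):
    """Learn a per-modality weight and return the reweighted features."""

    def __init__(self, feature_size):
        super().__init__()
        self.score = nn.Linear(feature_size, 1)  # one scalar score per modality

    def forward(self, features):
        stacked = torch.stack(features, dim=1)               # (batch, n_mod, feat)
        weights = torch.softmax(self.score(stacked), dim=1)  # normalize across modalities
        attended = stacked * weights                         # reweight each modality
        return list(attended.unbind(dim=1))


fusion = FeatureAttention(feature_size=32)
</code></pre>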
<pre><code class="language-python"># har_model.py — HARModel forward pass (simplified)

class HARModel(nn.Module):
    def forward(self, sensor_data: dict[str, torch.Tensor], n_users: torch.Tensor):
        batch_size = n_users.shape[0]
        # 1. Run each available modality through its dedicated processor;
        #    missing modalities fall back to zero tensors
        features = []
        for modality, processor in self.processors.items():
            if modality in sensor_data:
                features.append(processor(sensor_data[modality]))
            else:
                features.append(torch.zeros(batch_size, self.imu_feature_size))

        # 2. Fuse via cross-modal attention
        attended = self.feature_attention(features)

        # 3. Concatenate fused features with RFID-derived user count
        n_users_exp = n_users.float().unsqueeze(1)
        combined = torch.cat(attended + [n_users_exp], dim=1)

        # 4. Classify
        return self.classifier(combined)
</code></pre>
<hr />
<h2>07 — Training Strategy and the HARDataset</h2>
<p>Model training followed a conservative, reproducible workflow: data loaded directly from S3, explicit train/validation splits, best model checkpoint selected via validation accuracy, and final export to <strong>ONNX</strong> for edge deployment.</p>
<p>The dataset is encapsulated in a <code>HARDataset</code> class — a PyTorch dataset accessor designed for the open-source dataset derived from this collection effort. It includes a <strong>custom collate function</strong> to handle variable-length sequences across sensors with differing sampling frequencies. Each sample is structured as:</p>
<pre><code class="language-python"># har_dataset.py — sample structure

sample = {
    'features': upload_sample['features'],  # dict of sensor tensors
    'n_users':  upload_sample['n_users'],   # RFID-detected occupants
    'label':    self.labels[idx]            # encoded activity class
}
</code></pre>
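<p>The custom collate function can be sketched along these lines, padding each sensor's variable-length sequences with <code>pad_sequence</code> (a simplified stand-in for the actual implementation):</p>
<pre><code class="language-python">import torch
from torch.nn.utils.rnn import pad_sequence


def har_collate(batch):
    """Pad each sensor's sequences to the longest in the batch."""
    keys = batch[0]['features'].keys()
    features = {
        k: pad_sequence([s['features'][k] for s in batch], batch_first=True)
        for k in keys
    }
    n_users = torch.tensor([s['n_users'] for s in batch])
    labels = torch.tensor([s['label'] for s in batch])
    return {'features': features, 'n_users': n_users, 'labels': labels}
</code></pre>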
<p>The module supports optional transformation functions that researchers can plug in per sample, making the dataset straightforward to extend without modifying the core pipeline.</p>
<p>Crucially, the training pipeline is <strong>repeatable</strong>: new data improves the model without architectural changes. This was a deliberate design goal from the start.</p>
<hr />
<h2>08 — What This Phase Enables</h2>
<p>This phase marks the transition from a working prototype to a <strong>research platform</strong> — a system designed not just to recognize activities, but to support the iterative nature of HAR experimentation. Practically, this means:</p>
<ul>
<li><p>Researchers can grow the dataset incrementally without rebuilding pipelines</p>
</li>
<li><p>New activity labels can be introduced organically as the research evolves</p>
</li>
<li><p>Model retraining becomes routine, not risky</p>
</li>
<li><p>The same system serves both research exploration and production deployment goals</p>
</li>
</ul>
<hr />
<p><em>Coming up next:</em> <em><strong>Evaluation, system metrics, and edge performance trade-offs</strong></em> <em>— what do they reveal about deploying multimodal HAR in real-world environments?</em></p>
]]></content:encoded></item><item><title><![CDATA[From Sensors to Streams: Finalizing the IMU Integration and Introducing the Orchestrator]]></title><description><![CDATA[Welcome to the third post in my ongoing series documenting the development of my Computer Science dissertation: a privacy-aware, multimodal Human Activity Recognition (HAR) system built to run on resource-constrained edge devices.This sprint focused ...]]></description><link>https://blog.rodrigocaballero.net/from-sensors-to-streams</link><guid isPermaLink="true">https://blog.rodrigocaballero.net/from-sensors-to-streams</guid><category><![CDATA[iot]]></category><category><![CDATA[AI]]></category><category><![CDATA[Raspberry Pi]]></category><category><![CDATA[har]]></category><category><![CDATA[automation]]></category><category><![CDATA[Python]]></category><category><![CDATA[pypi]]></category><dc:creator><![CDATA[Rodrigo Caballero]]></dc:creator><pubDate>Fri, 25 Jul 2025 23:53:57 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1753487546861/d690c1f5-79d9-4c4e-acaf-69ad77a5f7a5.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the third post in my ongoing series documenting the development of my Computer Science dissertation: a <strong>privacy-aware, multimodal Human Activity Recognition (HAR) system</strong> built to run on <strong>resource-constrained edge devices</strong>.<br />This sprint focused on two major components:</p>
<ul>
<li><p>Finalizing the <strong>Inertial Measurement Unit (IMU) sensor integration</strong> using Message Queuing Telemetry Transport (MQTT) for mobile streaming.</p>
</li>
<li><p>Designing the first layer of the <strong>orchestrator</strong>, which bridges the sensor and cloud layers to build a curated dataset.</p>
</li>
</ul>
<h2 id="heading-imu-sensor-integration-via-mqtt">IMU Sensor Integration via MQTT</h2>
<p>To integrate motion data from mobile devices, I used the <a target="_blank" href="https://play.google.com/store/apps/details?id=com.kelvin.sensorapp&amp;hl=en">Sensor Logger app</a> for Android. Its premium version supports <strong>MQTT</strong>, allowing seamless publishing of IMU data from the phone to a central broker.</p>
<p>The following diagram illustrates the MQTT publish/subscribe architecture:</p>
<p><img src="https://mqtt.org/assets/img/mqtt-publish-subscribe.png" alt="MQTT: publish / subscribe architecture" /></p>
<p><em>Source:</em> <a target="_blank" href="https://mqtt.org"><em>https://mqtt.org</em></a></p>
<p>In short, MQTT clients publish messages to named <em>topics</em>, and the broker delivers each message to every client subscribed to that topic.</p>
<p>I used <a target="_blank" href="https://mosquitto.org">Eclipse Mosquitto</a> as my MQTT broker, hosted on the Raspberry Pi.</p>
<p>Here's a sample configuration to enable public access for testing:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Allow connections from network</span>
listener 1883 0.0.0.0

<span class="hljs-comment"># Allow anonymous connections (for testing)</span>
allow_anonymous <span class="hljs-literal">true</span>

<span class="hljs-comment"># Logging</span>
log_dest stdout
log_type all

<span class="hljs-comment"># Persistence</span>
persistence <span class="hljs-literal">false</span>
</code></pre>
<p>Then we start the broker:</p>
<pre><code class="lang-bash">rodrigo@raspberrypi:~/fp-imu-service $ mosquitto -c mosquitto.conf -v
1752622740: mosquitto version 2.0.11 starting
1752622740: Config loaded from mosquitto.conf.
1752622740: Opening ipv4 listen socket on port 1883.
1752622740: mosquitto version 2.0.11 running
</code></pre>
<p>There might already be a <code>mosquitto</code> process running after installation. To run the process with the desired configuration, we need to stop the current process using the command: <code>pkill mosquitto</code>.</p>
<h2 id="heading-sensor-layer-architecture-overview">Sensor Layer Architecture Overview</h2>
<p>The following diagram depicts the architecture of the sensor layer with the MQTT broker:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753115645717/e0c19207-e5ea-4108-8d84-e2921112fb2c.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-mqtt-topics">MQTT Topics:</h3>
<ol>
<li><p><code>recording_control</code>: instructs mobile devices when to start/stop logging.</p>
</li>
<li><p><code>data_stream</code>: receives IMU payloads from devices.</p>
</li>
</ol>
<h3 id="heading-participants">Participants:</h3>
<ul>
<li><p><strong>RFID service</strong> → Publishes control messages (start/stop)</p>
</li>
<li><p><strong>Mobile device</strong> → Subscribes to control, publishes IMU data</p>
</li>
<li><p><strong>IMU service</strong> → Subscribes to data, then buffers and processes payloads</p>
</li>
</ul>
<h2 id="heading-creating-a-reusable-mqtt-client-fp-mqtt-broker">Creating a Reusable MQTT Client: <code>fp-mqtt-broker</code></h2>
<p>MQTT client operations are needed across multiple devices and services. To reduce boilerplate (connection, disconnection, and message-handling logic), I created and published a Python package: <a target="_blank" href="https://pypi.org/project/fp-mqtt-broker/">fp-mqtt-broker on PyPI</a></p>
<p>This package offers an easy-to-use interface for connecting a device to an MQTT broker. It includes a ready-to-use implementation based on the <a target="_blank" href="https://pypi.org/project/paho-mqtt/">paho-mqtt</a> client, but is not tied to it: a factory pattern creates broker clients that handle:</p>
<ul>
<li><p>Connection setup</p>
</li>
<li><p>Topic subscriptions</p>
</li>
<li><p>Message handling (via a factory + handler interface)</p>
</li>
</ul>
<p>This is the basic example of using the package to create a new client to the MQTT broker and assign a custom message handler:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fp_mqtt_broker <span class="hljs-keyword">import</span> BrokerFactory
<span class="hljs-keyword">from</span> fp_mqtt_broker.abstractions <span class="hljs-keyword">import</span> MessageHandler

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">MyHandler</span>(<span class="hljs-params">MessageHandler</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_subscribed_topics</span>(<span class="hljs-params">self</span>):</span> <span class="hljs-keyword">return</span> [<span class="hljs-string">'my/topic'</span>]
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handle_message</span>(<span class="hljs-params">self, topic, payload</span>):</span> print(<span class="hljs-string">f"→ <span class="hljs-subst">{payload}</span>"</span>)

broker = BrokerFactory.create_broker(config, [MyHandler()])
broker.connect()
broker.publish_message(<span class="hljs-string">"my/topic"</span>, <span class="hljs-string">"hello world"</span>)
</code></pre>
<h2 id="heading-buffering-imu-data">Buffering IMU Data</h2>
<p>The new <a target="_blank" href="https://github.com/RodCaba/fp-imu-service">IMU service</a> includes a buffer system for:</p>
<ul>
<li><p>Accelerometer</p>
</li>
<li><p>Gyroscope</p>
</li>
<li><p>Gravity</p>
</li>
<li><p>Orientation (quaternions, pitch/roll/yaw)</p>
</li>
</ul>
<p>Each reading is validated before being added to a capped buffer.</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">IMUBuffer</span>:</span>
    <span class="hljs-string">"""Class to manage the IMU data buffer."""</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, max_size=<span class="hljs-number">10000</span></span>):</span>
        self.max_size = max_size  <span class="hljs-comment"># cap on each sensor buffer's length</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">validate_sensor_values</span>(<span class="hljs-params">self, values, name</span>):</span>
        <span class="hljs-string">"""Validate the structure of sensor values."""</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> isinstance(values, dict):
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"Values must be a JSON object"</span>)

        <span class="hljs-comment"># Check for required fields in values</span>
        required_fields = [<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>, <span class="hljs-string">'z'</span>]
        <span class="hljs-comment"># Orientation requires a different structure.</span>

        <span class="hljs-keyword">for</span> field <span class="hljs-keyword">in</span> required_fields:
            <span class="hljs-keyword">if</span> field <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> values:
                <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Missing required field: <span class="hljs-subst">{field}</span>"</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_to_buffer</span>(<span class="hljs-params">self, data, buffer</span>):</span>
        <span class="hljs-string">"""Add new IMU data to the buffer."""</span>
        <span class="hljs-keyword">if</span> len(buffer) &gt;= self.max_size:
            buffer.pop(<span class="hljs-number">0</span>)  <span class="hljs-comment"># Remove oldest data</span>
        buffer.append(data)
</code></pre>
<p>An <code>IMUMessageHandler</code> subscribes to the <code>data_stream</code> topic and pushes incoming data into the buffer. Payloads are expected in the form:</p>
<pre><code class="lang-python">{
  <span class="hljs-string">"payload"</span>: [
    { <span class="hljs-string">"name"</span>: <span class="hljs-string">"accelerometer"</span>, <span class="hljs-string">"values"</span>: { <span class="hljs-string">"x"</span>: ..., <span class="hljs-string">"y"</span>: ..., <span class="hljs-string">"z"</span>: ... } },
    ...
  ]
}
</code></pre>
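<p>A handler along these lines might validate and route each reading into its per-sensor buffer (the function and variable names here are illustrative, not the service's actual API):</p>
<pre><code class="lang-python">import json


def route_payload(raw, buffers):
    """Parse a data_stream message and append each recognized reading
    to its per-sensor buffer. Returns the number of readings accepted."""
    message = json.loads(raw)
    accepted = 0
    for entry in message.get("payload", []):
        name = entry.get("name")
        values = entry.get("values")
        if name in buffers and isinstance(values, dict):
            buffers[name].append(values)
            accepted += 1
    return accepted


buffers = {"accelerometer": [], "gyroscope": []}
raw = json.dumps({"payload": [
    {"name": "accelerometer", "values": {"x": 0.1, "y": 9.8, "z": 0.0}},
    {"name": "magnetometer", "values": {"x": 1, "y": 2, "z": 3}},  # not tracked
]})
</code></pre>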
<h2 id="heading-updated-rfid-service-behavior">Updated RFID Service Behavior</h2>
<p>The <a target="_blank" href="https://github.com/RodCaba/fp-rfid-reader-service">RFID service</a> now:</p>
<ul>
<li><p>Connects to the MQTT broker using <code>fp-mqtt-broker</code></p>
</li>
<li><p>Publishes <code>start</code> or <code>stop</code> commands based on tag swipes</p>
</li>
<li><p>Simultaneously triggers audio recognition via gRPC</p>
</li>
</ul>
<p>If you don't recall the details and tasks of the RFID service, you can revisit the previous entry at <a target="_blank" href="https://typo.hashnode.dev/from-rfid-to-recognition">https://typo.hashnode.dev/from-rfid-to-recognition</a>. In short, it is a service that manages GPIO connections with the Raspberry Pi, including an RC522 RFID reader, and interacts with devices like an LCD screen, buzzers, and LEDs.</p>
<p>Once connected, the RFID service publishes a message to the <code>recording_control</code> topic to start or stop IMU data gathering, depending on whether a recording session is already in progress when the tag is swiped.</p>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> IS_READING:
    <span class="hljs-comment"># Tag swiped while idle: instruct devices to start recording</span>
    mqtt_broker.publish_message(
        topic=config[<span class="hljs-string">'mqtt'</span>][<span class="hljs-string">'topics'</span>][<span class="hljs-string">'recording_control'</span>],
        payload=json.dumps({<span class="hljs-string">"action"</span>: <span class="hljs-string">"start"</span>, <span class="hljs-string">"session_id"</span>: id})
    )
<span class="hljs-keyword">else</span>:
    <span class="hljs-comment"># Tag swiped while recording: instruct devices to stop</span>
    mqtt_broker.publish_message(
        topic=config[<span class="hljs-string">'mqtt'</span>][<span class="hljs-string">'topics'</span>][<span class="hljs-string">'recording_control'</span>],
        payload=json.dumps({<span class="hljs-string">"action"</span>: <span class="hljs-string">"stop"</span>, <span class="hljs-string">"session_id"</span>: id})
    )
</code></pre>
<h2 id="heading-integration-results">Integration Results</h2>
<p>The video below shows the result of integrating MQTT with the sensor layer.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/feGNTDhGYHw">https://youtu.be/feGNTDhGYHw</a></div>
<p> </p>
<p>As the video shows, the sensor layer is fully connected, and each RFID tag swipe tells the system to start or stop gathering IMU and audio data.</p>
<h2 id="heading-introducing-the-orchestrator">Introducing: The Orchestrator</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753459412861/31bf5c7b-7798-4698-bc73-0b81b2a1284a.png" alt class="image--center mx-auto" /></p>
<p>With the sensor layer complete, it’s time to go <strong>up the stack</strong>. The <strong>orchestrator</strong> bridges the edge and cloud layers. It will:</p>
<ul>
<li><p>Provide a User Interface (UI) for labeling activity data.</p>
</li>
<li><p>Store sensor data in an <strong>AWS S3-based Data Lake</strong>.</p>
</li>
<li><p>Provide a communication bridge between the sensor layer and the cloud.</p>
</li>
</ul>
<p>The next diagram shows the planned architecture with the orchestrator:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753460891075/edade89a-00bb-4fd8-8c0e-009de35acc57.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-data-lake-design">Data Lake Design</h2>
<p>A Data Lake is a central place to store all your data, both structured and unstructured, at any scale. It keeps data in its raw form until needed for analysis, offering flexibility and scalability. Learn more about data lakes <a target="_blank" href="https://aws.amazon.com/what-is/data-lake/">here</a>.</p>
<p>In this Human Activity Recognition system, a Data Lake will store large amounts of diverse data, like IMU and audio data, from various sensors. This data collection is vital for creating a curated, annotated dataset to train machine learning models for recognizing human activities in multi-household settings. The Data Lake's ability to handle different data types and scale makes it perfect for this system's data needs.</p>
<h2 id="heading-fp-orchestrator-utils-python-package">fp-orchestrator-utils Python Package</h2>
<p>All services on the sensor layer need the orchestrator's new proto definitions to communicate, and duplicating these definitions on each service repository is inefficient. So, I created a public Python package, <a target="_blank" href="https://pypi.org/project/fp-orchestrator-utils/0.1.0/">fp-orchestrator-utils</a>, to provide utilities for the orchestrator. Unlike the fp-mqtt-broker package, this is more tailored to the project and supports only AWS S3, but the code is designed to be expandable for other use cases and cloud vendors.</p>
<h3 id="heading-features">Features:</h3>
<ul>
<li><p>CLI for downloading, generating, and uploading gRPC protos from S3</p>
</li>
<li><p>Programmatic S3 data operations with boto3 under the hood</p>
</li>
</ul>
<pre><code class="lang-bash">fp-orchestrator-utils proto download
fp-orchestrator-utils proto generate
</code></pre>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fp_orchestrator_utils <span class="hljs-keyword">import</span> S3Service

<span class="hljs-comment"># Instantiate the service (credentials come from environment configuration)</span>
s3 = S3Service()
s3.save(<span class="hljs-string">"data"</span>, <span class="hljs-string">"datalake/raw/data.csv"</span>)
</code></pre>
<h2 id="heading-further-considerations">Further Considerations</h2>
<ul>
<li><p>Previously, the RFID service <em>controlled and triggered</em> the sensor layer. This role will pass to the orchestrator, so further refactoring includes:</p>
<ul>
<li><p>RFID will identify household members via tag IDs.</p>
</li>
<li><p>The orchestrator will manage the start/stop logic via UI.</p>
</li>
<li><p>All services (IMU, Audio) will send data directly to the orchestrator over gRPC.</p>
</li>
</ul>
</li>
<li><p>Security is also an important consideration for the future.</p>
<ul>
<li>The MQTT broker currently runs with <code>allow_anonymous</code> enabled, which permits unauthenticated connections. To protect privacy, we must block these connections. After testing, we should also enable TLS (Transport Layer Security) and client authentication on the MQTT broker.</li>
</ul>
</li>
</ul>
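<p>As a sketch of that hardening, a Mosquitto configuration could disable anonymous access and enable TLS. The file paths below are placeholders, not the project's actual deployment.</p>
<pre><code class="lang-plaintext"># mosquitto.conf (sketch)
allow_anonymous false
password_file /etc/mosquitto/passwd

listener 8883
cafile /etc/mosquitto/certs/ca.crt
certfile /etc/mosquitto/certs/broker.crt
keyfile /etc/mosquitto/certs/broker.key
</code></pre>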
<h2 id="heading-next-steps">Next steps</h2>
<p>In the next sprint, we will focus on finalizing changes to the sensor layer so that each service can communicate directly with the orchestrator. Additionally, I will set up the data lake to collect enough data to create a curated, annotated, and open-source dataset. This dataset will be used for this project and others related to Human Activity Recognition in multi-household settings.</p>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>This sprint marked the transition from low-level integration to <strong>system-wide coordination</strong>. The sensor layer is now modular, testable, and fully interconnected. With MQTT and gRPC in place, the system is ready to scale and support richer functionality like annotation and dataset curation.</p>
<p>Follow the series at <a target="_blank" href="https://typo.hashnode.dev">typo.hashnode.dev</a> or explore the code on GitHub.</p>
]]></content:encoded></item><item><title><![CDATA[From RFID to Recognition: Integrating Sensor Layers for Privacy-Aware HAR on the Edge.]]></title><description><![CDATA[Welcome to the second entry in my blog series documenting the development of my dissertation project for a Computer Science degree. This series explores the construction of a privacy-aware, multimodal Human Activity Recognition (HAR) system, designed...]]></description><link>https://blog.rodrigocaballero.net/from-rfid-to-recognition</link><guid isPermaLink="true">https://blog.rodrigocaballero.net/from-rfid-to-recognition</guid><category><![CDATA[Raspberry Pi]]></category><category><![CDATA[Python]]></category><category><![CDATA[har]]></category><category><![CDATA[Machine Learning]]></category><dc:creator><![CDATA[Rodrigo Caballero]]></dc:creator><pubDate>Thu, 10 Jul 2025 20:32:48 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1752179323040/9bd675da-4551-4b5b-a46d-7ac8e543c344.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the second entry in my blog series documenting the development of my dissertation project for a Computer Science degree. This series explores the construction of a <strong>privacy-aware, multimodal Human Activity Recognition (HAR) system</strong>, designed to run on <strong>resource-constrained edge devices</strong>. The system monitors activity in a <strong>shared environment</strong>, using audio, IMU, and RFID sensor inputs.</p>
<h2 id="heading-sprint-focus-june-16th-29th">Sprint Focus (June 16th - 29th)</h2>
<p>This sprint covered the foundational infrastructure for the sensor layer, focusing on RFID and integration work. The objectives were:</p>
<ul>
<li><p>Build and test an <strong>RFID sensor layout.</strong></p>
</li>
<li><p>Establish <strong>unit and end-to-end testing frameworks.</strong></p>
</li>
<li><p>Connect the <strong>Audio and RFID services</strong> via <strong>gRPC protocol.</strong></p>
</li>
</ul>
<h2 id="heading-enhancements-from-previous-stage">Enhancements from Previous Stage</h2>
<ol>
<li><p>The full version of Raspberry Pi OS was replaced with the CLI-only Raspberry Pi OS Lite (64-bit). This version uses roughly a tenth of the disk space and frees volatile memory (RAM) by running far fewer background processes.</p>
</li>
<li><p>The ONNX model and the <code>best_model.pth</code> file created by PyTorch were saved directly in Git version control. However, at about 30 MB each, these files significantly slowed Git pulls of the audio service repository. To improve this, I used <a target="_blank" href="https://git-lfs.com">Git Large File Storage</a>, which replaces large files with text pointers in version control and lets the Raspberry Pi pull the actual files only when needed.</p>
</li>
</ol>
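<p>For reference, Git LFS records tracked patterns in a <code>.gitattributes</code> file. For the model files above (created with <code>git lfs track "*.pth" "*.onnx"</code>), it would contain entries like these:</p>
<pre><code class="lang-plaintext">*.pth filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
</code></pre>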
<h2 id="heading-sensor-layout-plan">Sensor Layout Plan</h2>
<p>The initial task was to plan the main layout of the sensor layer. This layer is responsible for collecting data from the environment for further processing. The HAR system will gather environmental data such as sound, Inertial Measurement Unit (IMU) data, and RFID tag data. The following diagram shows the user flow that will be followed to collect this sensor data for the project's development.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750798110119/0e78a305-0c3a-4218-8c1d-9875a750c926.png" alt class="image--center mx-auto" /></p>
<p>The RFID tag swipe process mimics how a smart home environment works when someone arrives home and uses an RFID tag to gain access. In this sequence diagram, two important changes to the initial assumptions are made:</p>
<ol>
<li><p>The plan was to use a long-range RFID reader so that when a user enters the kitchen, data collection would begin automatically. However, long-range RFID readers are more expensive and harder to set up. For this stage of the project, I decided to use the low-range, low-cost RFID reader RC522. This reader is also easy to integrate with the General Purpose Input Output (GPIO) pins of the Raspberry Pi 4.</p>
</li>
<li><p>General-purpose sensors, like the MPU6050, would be used to gather IMU data. However, these sensors require a powered-on microcontroller to process their data, which is less convenient than simply using a mobile phone running an app that sends the IMU data through an HTTP client.</p>
</li>
</ol>
<p>In this entry, I will focus on developing the RFID tag and its logic, which involves the first part of the sequence diagram: from the user to the buzzer.</p>
<p>The breadboard schematic below shows the sensor layout.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750796222947/afdcec85-e9aa-4f9d-967a-3f3aef5e420c.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-list-of-components">List of components</h3>
<ul>
<li><p>1 × 16×2 LCD screen</p>
</li>
<li><p>1 × I2C backpack for the LCD screen</p>
</li>
<li><p>1 × RFID RC522 reader</p>
</li>
<li><p>1 × Active buzzer</p>
</li>
<li><p>1 × Red LED</p>
</li>
<li><p>1 × Green LED</p>
</li>
<li><p>19 × Jumper wires</p>
</li>
<li><p>2 × 220 Ω resistors</p>
</li>
</ul>
<p>The following section explains the development and testing environment used to implement this sensor layer. It's important to note that <strong>this environment</strong> should be the standard for other project integrations and services.</p>
<h2 id="heading-development-environment">Development Environment</h2>
<p>To interact with the sensors, I will develop a Python package on a Windows machine using Microsoft's WSL (Windows Subsystem for Linux) with an Ubuntu distribution. This setup ensures that the resulting application is tailored to run on a Linux-based system, such as the Raspberry Pi OS Lite. You can find the code in the following repository: <a target="_blank" href="https://github.com/RodCaba/fp-rfid-reader-service">https://github.com/RodCaba/fp-rfid-reader-service</a>.</p>
<p>The external dependencies used by the repository are listed in the <code>README.md</code> file.</p>
<pre><code class="lang-plaintext">spidev==3.7
mfrc522==0.0.7
pytest==8.4.1
pytest-mock==3.14.1
RPLCD==1.4.0
smbus2==0.5.0
coverage==7.9.1
pytest-cov==6.2.1
</code></pre>
<h3 id="heading-code-layout">Code Layout</h3>
<p>Each sensor interactor in the code includes the following components (using the LCD I2C display interactor as an example):</p>
<ol>
<li><p>A service that is started by external libraries, such as the script used for end-to-end testing of the interactors or any other external services.</p>
<pre><code class="lang-python">   <span class="hljs-keyword">from</span> .base <span class="hljs-keyword">import</span> Writer

   <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LCDService</span>:</span>
     <span class="hljs-string">"""
     A service class for managing LCD operations.

     Attributes:
       writer: An instance of a Writer class that handles LCD writing operations.
     """</span>
     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, writer: Writer</span>):</span>
       <span class="hljs-string">"""
       Initializes the LCDService with a specific Writer instance.

       Args:
         writer (Writer): An instance of a Writer class that implements the LCD writing functionality.
       """</span>
       self.writer = writer

     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">write</span>(<span class="hljs-params">self, text: str</span>):</span>
       <span class="hljs-string">"""
       Writes text to the LCD display.

       Args:
         text (str): The text to be displayed on the LCD.
       """</span>
       <span class="hljs-keyword">try</span>:
         self.writer.write(text)
       <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
         print(<span class="hljs-string">f"Error writing to LCD: <span class="hljs-subst">{e}</span>"</span>)

     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">clear</span>(<span class="hljs-params">self</span>):</span>
       <span class="hljs-string">"""
       Clears the LCD display.

       This method calls the clear method of the Writer instance to clear any text currently displayed.
       """</span>
       <span class="hljs-keyword">try</span>:
         self.writer.clear()
       <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
         print(<span class="hljs-string">f"Error clearing LCD: <span class="hljs-subst">{e}</span>"</span>)
</code></pre>
</li>
<li><p>An abstract class for the sensor interactor is passed to the service. This abstraction applies the dependency inversion principle (the "D" in <a target="_blank" href="https://en.wikipedia.org/wiki/Dependency_inversion_principle">SOLID</a>) and lets the services run unit tests on devices other than the intended hardware, like the Raspberry Pi. For example, importing the GPIO module on anything other than a Raspberry Pi raises <code>RuntimeError: This module can only be run on a Raspberry Pi!</code></p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> abc <span class="hljs-keyword">import</span> ABC, abstractmethod

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Writer</span>(<span class="hljs-params">ABC</span>):</span>
     <span class="hljs-string">"""
     Abstract base class for LCD writers.

     This class defines the interface that all concrete LCD writer 
     implementations must follow.
     """</span>
     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">
             self,
             i2c_expander=<span class="hljs-string">"PCF8574"</span>,
             address=<span class="hljs-number">0x27</span>,
             port=<span class="hljs-number">1</span>,
             cols=<span class="hljs-number">16</span>,
             rows=<span class="hljs-number">2</span>,
             dotsize=<span class="hljs-number">8</span>,
         </span>):</span>
         <span class="hljs-string">"""
         Initialize the LCD writer.

         This method can be overridden by subclasses to perform any necessary
         setup for the LCD display.
         """</span>
         self.i2c_expander = i2c_expander
         self.address = address
         self.port = port
         self.cols = cols
         self.rows = rows
         self.dotsize = dotsize

     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__del__</span>(<span class="hljs-params">self</span>):</span>
         <span class="hljs-string">"""
         Clean up resources when the LCD writer is deleted.

         The LCD display should be cleared to ensure no residual text
         remains when the writer is no longer in use.
         """</span>
         <span class="hljs-keyword">try</span>:
             self.clear()
         <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
             print(<span class="hljs-string">f"Error during cleanup: <span class="hljs-subst">{e}</span>"</span>)

<span class="hljs-meta">     @abstractmethod</span>
     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">write</span>(<span class="hljs-params">self, text: str</span>):</span>
         <span class="hljs-string">"""
         Write text to the LCD display.

         Args:
             text (str): The text to display on the LCD.

         Raises:
             Exception: If there's an error writing to the display.
         """</span>
         <span class="hljs-keyword">pass</span>

<span class="hljs-meta">     @abstractmethod</span>
     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">clear</span>(<span class="hljs-params">self</span>):</span>
         <span class="hljs-string">"""
         Clear the LCD display.

         This method should be called to clear any text currently displayed
         on the LCD.
         Raises:
             Exception: If there's an error clearing the display.
         """</span>
         <span class="hljs-keyword">pass</span>
</code></pre>
</li>
<li><p>One or more concrete implementations of the sensor interaction abstraction. This involves importing external modules and implementing the abstraction functions.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> ..base <span class="hljs-keyword">import</span> Writer
 <span class="hljs-keyword">from</span> RPLCD.i2c <span class="hljs-keyword">import</span> CharLCD

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">CharLCDWriter</span>(<span class="hljs-params">Writer</span>):</span>
     <span class="hljs-string">"""
     Concrete implementation of the Writer interface for character LCD displays.
     """</span>

     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">
             self,
             i2c_expander=<span class="hljs-string">"PCF8574"</span>,
             address=<span class="hljs-number">0x27</span>,
             port=<span class="hljs-number">1</span>,
             cols=<span class="hljs-number">16</span>,
             rows=<span class="hljs-number">2</span>,
             dotsize=<span class="hljs-number">8</span>,
         </span>):</span>
         super().__init__(
             i2c_expander=i2c_expander,
             address=address,
             port=port,
             cols=cols,
             rows=rows,
             dotsize=dotsize,
         )
         self.lcd = CharLCD(
             i2c_expander=i2c_expander,
             address=address,
             port=port,
             cols=cols,
             rows=rows,
             dotsize=dotsize,
         )

     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">write</span>(<span class="hljs-params">self, text: str</span>):</span>
         <span class="hljs-string">"""
         Write text to the LCD display.

         Args:
             text (str): The text to display on the LCD.

         Raises:
             Exception: If there's an error writing to the display.
         """</span>
         <span class="hljs-keyword">try</span>:
             self.lcd.write_string(text)
         <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
             <span class="hljs-keyword">raise</span> Exception(<span class="hljs-string">f"Error writing to LCD: <span class="hljs-subst">{e}</span>"</span>)

     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">clear</span>(<span class="hljs-params">self</span>):</span>
         <span class="hljs-string">"""
         Clear the LCD display.

         Raises:
             Exception: If there's an error clearing the display.
         """</span>
         <span class="hljs-keyword">try</span>:
             self.lcd.clear()
         <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
             <span class="hljs-keyword">raise</span> Exception(<span class="hljs-string">f"Error clearing LCD: <span class="hljs-subst">{e}</span>"</span>)
</code></pre>
</li>
</ol>
<p>In summary, each hardware interaction is abstracted and injected into a service to:</p>
<ul>
<li><p>Avoid hardware errors, such as <code>RPi.GPIO</code> import failures, on non-Pi machines</p>
</li>
<li><p>Enable full mocking and testing</p>
</li>
<li><p>Follow <strong>Dependency Inversion</strong> and <strong>Open/Closed</strong> principles</p>
</li>
</ul>
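<p>To make the benefit concrete, here is a sketch of a unit test that exercises the service through a mocked writer. The <code>LCDService</code> stand-in below mirrors the class shown earlier; in the repository it would instead be imported from the <code>src/lcd</code> package.</p>
<pre><code class="lang-python">from unittest.mock import Mock

# Minimal stand-in mirroring the LCDService shown earlier; the real class
# would be imported from the repository's src/lcd package.
class LCDService:
    def __init__(self, writer):
        self.writer = writer

    def write(self, text):
        try:
            self.writer.write(text)
        except Exception as e:
            print(f"Error writing to LCD: {e}")

# A Mock satisfies the Writer interface, so no Raspberry Pi hardware
# (and no RPi.GPIO import) is needed to run these tests.
def test_write_delegates_to_writer():
    writer = Mock()
    LCDService(writer).write("Tag detected")
    writer.write.assert_called_once_with("Tag detected")

def test_write_swallows_writer_errors():
    writer = Mock()
    writer.write.side_effect = RuntimeError("I2C bus error")
    LCDService(writer).write("Tag detected")  # must not raise
</code></pre>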
<h2 id="heading-testing-environment">Testing Environment</h2>
<h3 id="heading-unit-testing">Unit Testing</h3>
<p>Under the <code>tests</code> folder, the layout of the <code>src</code> folder is copied so that each application service has its own set of unit tests. This suite includes unit tests for the sensor service and each specific implementation of the sensor interactor.</p>
<p>The project runs unit tests on every push or pull request to the master branch using a GitHub action workflow. This ensures that everything added to the master branch passes the unit test suites.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750884302722/9fd57ddf-be52-4873-8f2b-3f5f372e3983.png" alt class="image--center mx-auto" /></p>
<p>Furthermore, one of the project's goals is to achieve over 80% statement test coverage in unit tests. The current report from <code>pytest-cov</code> shows 95% coverage.</p>
<pre><code class="lang-bash">======================================================================= tests coverage =======================================================================
______________________________________________________ coverage: platform linux, python 3.10.12-final-0 ______________________________________________________

Name                                           Stmts   Miss  Cover
------------------------------------------------------------------
src/gpio/gpio_controller.py                       23      0   100%
src/lcd/base.py                                   20      2    90%
src/lcd/implementations/charlcd_writer.py         16      0   100%
src/lcd/lcd_service.py                            14      0   100%
src/reader/base.py                                 8      2    75%
src/reader/implementations/mfrc522_reader.py      16      1    94%
src/reader/reader_service.py                      11      0   100%
------------------------------------------------------------------
TOTAL                                            108      5    95%
</code></pre>
<h3 id="heading-integration-tests">Integration Tests</h3>
<p>In addition to the unit test setup, the repository includes an <code>integration</code> folder with integration tests. These tests ensure that the integration between services works as expected. They are labeled with an "integration" tag using the <code>pytest</code> framework and, like unit tests, are run in the GitHub action workflow when there is a push or pull request to the master branch.</p>
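<p>For context, pytest markers like this are typically registered in the project configuration so the suite can be selected or skipped from the command line. The fragment below is a generic sketch, not the repository's actual file.</p>
<pre><code class="lang-plaintext"># pytest.ini (sketch)
[pytest]
markers =
    integration: tests that exercise real service-to-service interactions

# Run only (or skip) the integration suite:
#   pytest -m integration
#   pytest -m "not integration"
</code></pre>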
<h3 id="heading-end-to-end-tests">End to End Tests</h3>
<p>Finally, a set of end-to-end tests was set up for the user-to-buzzer interaction. A test format was developed, available at this <a target="_blank" href="https://docs.google.com/spreadsheets/d/e/2PACX-1vSYmIEgCyzeMgoAVpV2uuvP_5tXgmkRH6a3YgAs-i6lea1IeYeNCrSGrvv3DSJGfw/pubhtml">URL</a>, which describes the test scenarios, the execution log, and the software and hardware specifications of the tests. The end-to-end script implements the intended functionality and is run against these scenarios.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/XC2Hvi93cXg">https://youtu.be/XC2Hvi93cXg</a></div>
<p> </p>
<h2 id="heading-building-the-sensor-layout">Building the Sensor Layout</h2>
<h3 id="heading-lcd-display-i2c">LCD Display I2C</h3>
<p>The first step is to solder the I2C backpack to the LCD display. Some versions of the LCD display come with the backpack already soldered, which can save you this step (especially if you're not great at soldering like me). Next, I enabled the I2C interface using the <code>raspi-config</code> module and installed the necessary tools with <code>sudo apt-get install i2c-tools python3-smbus</code>.</p>
<p>After connecting the LCD to the Raspberry Pi, you need to detect the I2C bus. Run the following command and take note of the address given:</p>
<pre><code class="lang-bash">$ sudo i2cdetect 1

WARNING! This program can confuse your I2C bus, cause data loss and worse!
I will probe file /dev/i2c-1.
I will probe address range 0x08-0x77.
Continue? [Y/n] Y
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- 27 -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --
</code></pre>
<p>This matrix shows a device detected at address <code>0x27</code> on I2C bus 1. You will need this address in the code when starting the LCD service.</p>
<h3 id="heading-rfid-reader">RFID Reader</h3>
<p>Once the RFID reader is soldered, I connect it to the GPIOs of the Raspberry Pi, enable the SPI interface using the <code>raspi-config</code> tool and then reboot the device. The MFRC522 Python package includes a <code>SimpleMFRC522</code> class that makes it easier to initialize and communicate with the reader. However, there is a known compatibility issue with newer RFID tags that do not support authentication protocols. When you run the reader function and detect a new RFID tag (like an NTAG215), the following runtime error will occur:</p>
<pre><code class="lang-bash"><span class="hljs-string">"AUTH ERROR!!
AUTH ERROR(status2reg &amp; 0x08) != 0"</span>
</code></pre>
<p>To solve this, you need to use the <code>MFRC522</code> class instead. You can find more details about the issue on this GitHub page, where the solution was sourced: <a target="_blank" href="https://github.com/pimylifeup/MFRC522-python/issues/31">https://github.com/pimylifeup/MFRC522-python/issues/31</a></p>
<h2 id="heading-grpc-services-connection">gRPC Services Connection</h2>
<p>To establish a connection between both services, I decided to use the gRPC protocol. This choice was driven by the need for <strong>enhanced performance</strong> and the compatibility with the solution's <strong>microservices architecture</strong>. The gRPC protocol is well-suited for this task because it allows for efficient communication between distributed systems. In this setup, the audio service will be made accessible, and the RFID service will send a request to initiate the audio recognition loop whenever an RFID tag is detected.</p>
<p>The initial step in this process involves defining the Protocol Buffer, which serves as the interface definition language for the service. This definition will specify the structure of the data and the methods that the services will use to communicate.</p>
<pre><code class="lang-plaintext">syntax = "proto3";

package audio_service;

// Audio processing service
service AudioService {
    // Start audio recording and processing
    rpc StartAudioProcessing(AudioRequest) returns (AudioResponse);

    // Get the status of audio processing
    rpc GetProcessingStatus(StatusRequest) returns (StatusResponse);

    // Health check
    rpc HealthCheck(HealthCheckRequest) returns (HealthCheckResponse);
}

// Request message for audio processing
message AudioRequest {
    string session_id = 1;           // Unique session identifier
    int32 recording_duration = 2;    // Duration in seconds
    string output_format = 3;        // Output format (wav, mp3, etc.)
}
</code></pre>
<p>In this last extract of the definition file, I'm defining the Protocol Buffer to serialize the data that the service will request and return as a response. I can also set up a set of functions that the RFID service can call. Using the <code>grpc_tools</code> Python package, I've compiled the Protocol Buffer definition into Python classes to be used in the code.</p>
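<p>For completeness, the matching <code>AudioResponse</code> message might look like the sketch below. The field names come from the servicer code shown next, but the types and field numbers are my assumption, not the project's actual definition.</p>
<pre><code class="lang-plaintext">// Response message returned once processing finishes (sketch)
message AudioResponse {
    string session_id = 1;
    bool success = 2;
    string predicted_class = 3;
    float confidence = 4;
    repeated string top_predictions = 5;
}
</code></pre>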
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> uuid

<span class="hljs-keyword">from</span> src.grpc_generated <span class="hljs-keyword">import</span> audio_service_pb2, audio_service_pb2_grpc
<span class="hljs-keyword">from</span> src.predictor.predict <span class="hljs-keyword">import</span> AudioPredictor


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AudioService</span>(<span class="hljs-params">audio_service_pb2_grpc.AudioServiceServicer</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-comment"># Initialize the predictor</span>
        model_path = os.path.join(
                                  <span class="hljs-string">"exported_models"</span>, <span class="hljs-string">"model.onnx"</span>)
        print(<span class="hljs-string">f"Loading model from: <span class="hljs-subst">{model_path}</span>"</span>)
        self.predictor = AudioPredictor(model_path, feature_type=<span class="hljs-string">"melspectrogram"</span>)

       ...

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">StartAudioProcessing</span>(<span class="hljs-params">self, request, context</span>):</span>
        <span class="hljs-string">"""Start audio recording and processing"""</span>
        session_id = request.session_id <span class="hljs-keyword">or</span> str(uuid.uuid4())

        <span class="hljs-comment"># Process audio code with AudioPredictor...</span>

        <span class="hljs-keyword">return</span> audio_service_pb2.AudioResponse(
                session_id=session_id,
                success=<span class="hljs-literal">True</span>,
                predicted_class=predicted_class,
                confidence=float(confidence),
                top_predictions=top_predictions
            )
</code></pre>
<p>Each function defined in the Protocol Buffer must be implemented in the class that extends the AudioServiceServicer class. These functions should accept and return the expected Protocol Buffer types.</p>
<p>The same Protocol Buffer definition and gRPC auto-generated Python classes are used in the RFID service. A client is set up to connect and call the functions to communicate with the audio service.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> logging
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Dict, Optional

<span class="hljs-keyword">import</span> grpc

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AudioServiceClient</span>:</span>
    <span class="hljs-string">"""gRPC client for Audio Service"""</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, server_address: str = None, timeout: int = <span class="hljs-number">30</span></span>):</span>
        <span class="hljs-comment"># Use environment variable or default to Docker service name</span>
        <span class="hljs-keyword">if</span> server_address <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            server_address = os.environ.get(<span class="hljs-string">'AUDIO_SERVICE_URL'</span>, <span class="hljs-string">'localhost:50051'</span>)

        self.server_address = server_address
        self.timeout = timeout
        self.logger = logging.getLogger(__name__)
        self._connect()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_connect</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-string">"""Establish connection to audio service"""</span>
        self.logger.info(<span class="hljs-string">f"Attempting to connect to audio service at <span class="hljs-subst">{self.server_address}</span>"</span>)
        self.channel = grpc.insecure_channel(self.server_address)
        self.stub = audio_service_pb2_grpc.AudioServiceStub(self.channel)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">start_audio_processing</span>(<span class="hljs-params">self, duration: int = <span class="hljs-number">5</span>, session_id: Optional[str] = None</span>) -&gt; Optional[Dict]:</span>
        <span class="hljs-string">"""
        Start audio recording and processing
        """</span>
        <span class="hljs-keyword">if</span> session_id <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            session_id = str(uuid.uuid4())

        request = audio_service_pb2.AudioRequest(
            session_id=session_id,
            recording_duration=duration,
            output_format=<span class="hljs-string">"wav"</span>
        )

        response = self.stub.StartAudioProcessing(request, timeout=self.timeout)
</code></pre>
<h2 id="heading-final-integrated-result">Final Integrated Result</h2>
<p>The RFID service waits for an RFID swipe. Once this happens, it creates a new thread using the <code>threading</code> package. This thread requests audio processing from the Audio service over gRPC until the RFID tag is swiped again. A separate thread is used so that audio processing does not block RFID swipe detection on the main thread.</p>
<p>The following video demonstrates the integrated result. You can find the end-to-end testing execution in the <a target="_blank" href="https://docs.google.com/spreadsheets/d/e/2PACX-1vSYmIEgCyzeMgoAVpV2uuvP_5tXgmkRH6a3YgAs-i6lea1IeYeNCrSGrvv3DSJGfw/pubhtml">End to end testing format</a>.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/reBfdSbLQlM">https://youtu.be/reBfdSbLQlM</a></div>
<p> </p>
<p>As the test execution log explains, the RFID swipe stress test is failing. Swiping the RFID reader repeatedly creates new threads and causes overlaps in predictions. Ideally, the audio processing thread should finish before starting a new one.</p>
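<p>One way to enforce this is to reject new sessions while a worker thread is still alive. The sketch below uses Python's <code>threading</code> module; the <code>SessionGuard</code> name and structure are my own illustration, not code from the repository.</p>
<pre><code class="lang-python">import threading

class SessionGuard:
    """Allow at most one audio-processing thread at a time (illustrative)."""

    def __init__(self):
        self._worker = None

    def start(self, work, *args):
        # Ignore the swipe if the previous session is still running
        if self._worker is not None and self._worker.is_alive():
            return False
        # The real RFID service would invoke its gRPC client inside `work`
        self._worker = threading.Thread(target=work, args=args, daemon=True)
        self._worker.start()
        return True
</code></pre>
<p>With this guard, a repeated swipe during an active session is simply ignored instead of spawning an overlapping prediction thread.</p>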
<h2 id="heading-known-limitations">Known Limitations</h2>
<ol>
<li>The plan was to use Docker containers for the services, with the Raspberry Pi managing the initialization of these containers. However, because containers operate separately from the Raspberry Pi environment, setting up the audio hardware and GPIO pins in the containers proved to be complicated. As a result, the idea was abandoned in favor of running the Python scripts and starting the servers on the local machine.</li>
</ol>
<h2 id="heading-whats-next">What’s Next?</h2>
<ol>
<li><p>To enhance the quality and maintainability of the codebase, the next step involves implementing a linting tool integrated with a GitHub Actions (GHA) workflow.</p>
<ol>
<li>This tool will automatically check the code for adherence to coding standards, ensuring consistency and promoting best practices across the entire project.</li>
</ol>
</li>
<li><p>Implement the IMU (Inertial Measurement Unit) Service, which is designed to capture detailed IMU data from the mobile device, including accelerometer, gyroscope, and other sensor readings.</p>
</li>
<li><p>As shown by the E2E tests, the RFID swipe handling needs to be more robust to pass the stress executions.</p>
</li>
</ol>
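<p>One possible approach to the more robust swipe handling called for above (an assumed sketch, not the project's implementation): a non-blocking lock turns repeated swipes into no-ops while a recording is still in flight, so audio-processing threads can never overlap.</p>

```python
import threading

_busy = threading.Lock()
_gate = threading.Event()  # keeps the demo worker alive until we release it

def _worker(log):
    try:
        _gate.wait()        # stand-in for the audio-processing work
        log.append("done")
    finally:
        _busy.release()     # allow the next swipe once processing ends

def on_swipe(log):
    # Reject the swipe if a recording is already running.
    if not _busy.acquire(blocking=False):
        return False
    threading.Thread(target=_worker, args=(log,)).start()
    return True

log = []
print(on_swipe(log))  # True: first swipe starts a recording
print(on_swipe(log))  # False: second swipe ignored while busy
_gate.set()           # let the worker finish
```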
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>This sprint marked a pivotal step in bridging the hardware and software layers of the HAR system. From soldering and assembling the sensor layout to implementing abstraction layers and wiring services with gRPC, the project is now equipped with a solid, extensible foundation. The successful integration between RFID and audio modules shows that even in constrained environments, it's possible to build smart, responsive systems with strong architectural principles.</p>
<p>The upcoming sprints will push this foundation further—by incorporating mobile-based IMU data, refining audio recognition, and reinforcing system performance under load.</p>
<p>Thanks for following along. Feel free to explore the repositories, share feedback, or connect if you're navigating similar challenges in edge AI, IoT, or smart environments.</p>
<p>📬 <em>Follow this series on</em> <a target="_blank" href="https://typo.hashnode.dev"><em>typo.hashnode.dev</em></a> <em>to see how this HAR system evolves from prototype to production-ready.</em></p>
]]></content:encoded></item><item><title><![CDATA[Privacy-Aware Multimodal HAR System on the Edge]]></title><description><![CDATA[Introduction
As part of my Computer Science dissertation, I’m developing a multi-resident, multimodal Human Activity Recognition (HAR) system tailored for privacy-sensitive environments like shared kitchens. This system is designed to operate on reso...]]></description><link>https://blog.rodrigocaballero.net/audio-recognition-rbpi</link><guid isPermaLink="true">https://blog.rodrigocaballero.net/audio-recognition-rbpi</guid><category><![CDATA[AI]]></category><category><![CDATA[edgecomputing]]></category><category><![CDATA[Python]]></category><category><![CDATA[Raspberry Pi]]></category><category><![CDATA[prototyping]]></category><dc:creator><![CDATA[Rodrigo Caballero]]></dc:creator><pubDate>Sat, 21 Jun 2025 15:02:38 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1750517866566/e9a236cf-570e-4da7-a95d-c52fcefadea7.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>As part of my Computer Science dissertation, I’m developing a <strong>multi-resident, multimodal Human Activity Recognition (HAR)</strong> system tailored for privacy-sensitive environments like shared kitchens. This system is designed to operate on <strong>resource-constrained edge devices</strong>, using a mix of <strong>low-resolution audio, IMU sensors, and RFID tags</strong>.</p>
<p>You can find a more detailed literature review and the motivation behind the project at this <a target="_blank" href="https://docs.google.com/document/d/e/2PACX-1vS-NGYzTKVmEbb3WYTKFKBOuVdx8XMZpv8Okhm3qRC49KtE6BTFuWHqIf13jEe0TMhJSbambIdoPFVH/pub">link</a>.</p>
<p>In this first installment of the series, I’ll walk you through the <strong>prototype implementation</strong> of one of the most challenging features: <strong>audio-based activity recognition</strong> on a Raspberry Pi.</p>
<h2 id="heading-prototype-architecture">Prototype Architecture</h2>
<p>The feature prototype focuses on processing environmental audio to predict activity using a <strong>CNN model</strong> trained on the <a target="_blank" href="https://github.com/marc-moreaux/kitchen20">Kitchen20</a> dataset. Kitchen20 is a rich dataset for environmental audio in kitchen settings.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750351849169/782ad7a5-1120-4e17-8cc5-55e4058fedbc.png" alt class="image--center mx-auto" /></p>
<p>Key components:</p>
<ul>
<li><p><strong>Raspberry Pi 4 Model B (1GB RAM)</strong></p>
<ul>
<li>Although this Raspberry Pi model is available with more RAM, I deliberately chose the lowest configuration so the project is designed around a genuinely resource-constrained device, keeping the focus on efficient, cost-effective HAR solutions.</li>
</ul>
</li>
<li><p><strong>USB omnidirectional microphone</strong></p>
</li>
<li><p><strong>ONNX exported model</strong></p>
</li>
<li><p><strong>Google TTS for audio output</strong></p>
</li>
</ul>
<h2 id="heading-model-training-with-kitchen20">Model Training with Kitchen20</h2>
<p>The original PyTorch implementation of Kitchen20 relied on outdated <code>torchaudio</code> APIs, as the following code excerpt shows:</p>
<pre><code class="lang-python">audio_set = Kitchen20(
        root=<span class="hljs-string">'/media/data/dataest/kitchen20/'</span>,
        folds=[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>],
        transforms=transforms.Compose([ <span class="hljs-comment"># transforms.Compose is no longer available in torchaudio</span>
            transforms.RandomStretch(<span class="hljs-number">1.25</span>),
            transforms.Scale(<span class="hljs-number">2</span> ** <span class="hljs-number">16</span> / <span class="hljs-number">2</span>),
            transforms.Pad(input_length // <span class="hljs-number">2</span>),
            transforms.RandomCrop(input_length),
            transforms.RandomOpposite()]),
        overwrite=<span class="hljs-literal">False</span>,
        use_bc_learning=<span class="hljs-literal">False</span>,
        audio_rate=audio_rate)

    audio_loader = DataLoader(audio_set, batch_size=<span class="hljs-number">2</span>,
                              shuffle=<span class="hljs-literal">True</span>, num_workers=<span class="hljs-number">4</span>)
</code></pre>
<p>The <code>Compose</code> function for audio transformations was available only in the very first release of <code>torchaudio</code>; it was removed as a breaking change in the next release, <a target="_blank" href="https://github.com/pytorch/audio/releases/tag/v0.3.0">version 0.3.0</a>.</p>
<p>I re-implemented the dataset accessor using modern <code>torch</code> and <code>torchaudio</code> versions, which you can find in the public GitHub repository: 🔗 <a target="_blank" href="https://github.com/RodCaba/fp-audio-service/tree/master/lib/kitchen20-pytorch">Kitchen20 PyTorch Accessor (Updated)</a></p>
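<p>For context, the removed helper is easy to recreate, since chaining transforms is just function composition. A minimal, hypothetical sketch of a <code>Compose</code> replacement (plain callables stand in for the audio transforms):</p>

```python
class Compose:
    """Apply a sequence of callables to an input, left to right."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for transform in self.transforms:
            x = transform(x)
        return x

# Plain callables stand in for transforms like RandomStretch or Pad.
pipeline = Compose([lambda x: x * 2, lambda x: x + 1])
print(pipeline(3))  # 7
```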
<p>I trained a <strong>four-layer CNN</strong>, then exported the model to <strong>ONNX format</strong> for lightweight inference on the edge device. The current results are:</p>
<ul>
<li><p><strong>Approximately 30% accuracy</strong> on the training data: the model is learning to some extent, but there is clearly significant room for improvement. The low accuracy suggests it is not yet capturing the necessary patterns in the data effectively.</p>
</li>
<li><p>Plans for improvement include <strong>integrating IMU and RFID modalities</strong>, as well as refining preprocessing: To enhance the model's performance, we plan to incorporate additional data sources, such as IMU (Inertial Measurement Unit) and RFID (Radio Frequency Identification) modalities. These additional data streams can provide more context and features for the model to learn from, potentially leading to better accuracy.</p>
</li>
</ul>
<h2 id="heading-preprocessing-pipeline">Preprocessing Pipeline</h2>
<p>The preprocessing pipeline of audio includes:</p>
<ol>
<li><p>Sample rate adjustment</p>
<pre><code class="lang-python"> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">preprocess_audio</span>(<span class="hljs-params">
       waveform,
       original_sample_rate,
       target_sample_rate=<span class="hljs-number">16000</span>,
       target_length=<span class="hljs-number">4</span>,
 </span>):</span>
     <span class="hljs-string">"""
     Preprocess the audio waveform to have a consistent length and sample rate.

     Args:
         waveform (Tensor): The audio waveform.
         original_sample_rate (int): The original sample rate of the audio.
         target_sample_rate (int, optional): The target sample rate. Defaults to 16000.
         target_length (int, optional): Target number of samples.

     Returns:
         Tensor: The preprocessed audio waveform.
     """</span>
     <span class="hljs-comment"># Convert to mono if stereo</span>
     <span class="hljs-keyword">if</span> waveform.size(<span class="hljs-number">0</span>) &gt; <span class="hljs-number">1</span>:
         waveform = torch.mean(waveform, dim=<span class="hljs-number">0</span>, keepdim=<span class="hljs-literal">True</span>)

     <span class="hljs-comment"># Resample the audio if the sample rate is different</span>
     <span class="hljs-keyword">if</span> original_sample_rate != target_sample_rate:
         waveform = torchaudio.transforms.Resample(
             orig_freq=original_sample_rate,
             new_freq=target_sample_rate
         )(waveform)

     <span class="hljs-comment"># Adjust length</span>
     current_length = waveform.shape[<span class="hljs-number">1</span>]
     <span class="hljs-comment"># Trim or pad the waveform to the target length</span>
     <span class="hljs-keyword">if</span> current_length &gt; target_length:
         waveform = waveform[:, :target_length]
     <span class="hljs-keyword">else</span>:
         waveform = torch.nn.functional.pad(waveform, (<span class="hljs-number">0</span>, target_length - current_length))

     <span class="hljs-keyword">return</span> waveform
</code></pre>
</li>
<li><p>Feature extraction using Pytorch (as opposed to Moreaux, 2019 work that used Librosa library):</p>
<ul>
<li><p><strong>MFCC</strong></p>
</li>
<li><p><strong>MelSpectrogram</strong></p>
</li>
</ul>
</li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> feature_type == <span class="hljs-string">'melspectrogram'</span>:
            self.transform = torchaudio.transforms.MelSpectrogram(
                sample_rate=sample_rate,
                n_fft=n_fft,
                hop_length=hop_length,
                n_mels=n_mels,
                f_min=f_min,
                f_max=f_max,
            )
            self.db_transform = torchaudio.transforms.AmplitudeToDB()
<span class="hljs-keyword">elif</span> feature_type == <span class="hljs-string">'mfcc'</span>:
            self.transform = torchaudio.transforms.MFCC(
                sample_rate=sample_rate,
                n_mfcc=n_mfcc,
                melkwargs={
                    <span class="hljs-string">'n_fft'</span>: n_fft,
                    <span class="hljs-string">'hop_length'</span>: hop_length,
                    <span class="hljs-string">'n_mels'</span>: n_mels,
                    <span class="hljs-string">'f_min'</span>: f_min,
                    <span class="hljs-string">'f_max'</span>: f_max,
                }
            )
            self.db_transform = <span class="hljs-literal">None</span>
</code></pre>
<p>This preprocessing service is shared between the <strong>cloud (for training)</strong> and the <strong>edge (for prediction)</strong> layers to ensure consistent input formats. Some of the work that is considered to improve the preprocessing pipeline includes:</p>
<ul>
<li><p>Refactor the existing codebase to function as a <strong>standalone service</strong>. This involves decoupling the preprocessing logic from the main application, allowing it to operate independently.</p>
</li>
<li><p>Extend the current preprocessing capabilities by adding support for <strong>IMU</strong> and <strong>RFID</strong> data. This will involve developing new transformation pipelines tailored to the specific characteristics of IMU and RFID data.</p>
</li>
<li><p>Conduct a thorough parameter tuning process to optimize the system's performance. This is probably the most important improvement needed during the project's development phase. Due to time limits, the feature prototype was created quickly to show that the ONNX model could work with the edge device and kitchen environment, without focusing much on the preprocessing service details. The first goal was to copy the work of Moreaux et al. (2019) with minimal changes to make it run on modern libraries, mainly fixing compile and runtime errors. I believe this is why the model's accuracy and confidence are low. I need to explore and find the best audio preprocessing practices for CNN to greatly improve this key part of the project.</p>
</li>
</ul>
<h2 id="heading-edge-device-inference-loop">Edge Device Inference Loop</h2>
<p>The Raspberry Pi (edge device layer) performs the following in a loop:</p>
<ol>
<li><p>Records 5 seconds of audio</p>
</li>
<li><p>Uses the shared preprocessing pipeline</p>
</li>
<li><p>Feeds features to the ONNX model using <code>onnxruntime</code></p>
</li>
<li><p>Outputs:</p>
<ul>
<li><p>Prediction label</p>
</li>
<li><p>Confidence score</p>
</li>
</ul>
</li>
<li><p>Plays the outputs as audible feedback using <code>gTTS</code> + <code>playsound</code></p>
</li>
</ol>
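<p>Step 4 of the loop, turning raw model outputs into a label and a confidence score, can be sketched as follows. This is an illustrative, self-contained example: the label list and <code>decode_prediction</code> helper are hypothetical, and a softmax over raw logits stands in for the real ONNX session output.</p>

```python
import math

# Illustrative subset of Kitchen20 class labels (hypothetical).
LABELS = ["boiling", "chopping", "frying"]

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_prediction(logits, labels=LABELS):
    """Return (label, confidence) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return labels[idx], probs[idx]

label, confidence = decode_prediction([0.1, 2.3, 0.4])
print(label)  # chopping
```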
<p>The following short video showcases the result of the prototype.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/LX5LW3Y-ut4">https://youtu.be/LX5LW3Y-ut4</a></div>
<p> </p>
<p>This setup confirms the <strong>feasibility</strong> of real-time HAR prediction on <strong>low-power hardware</strong>, which is crucial for assisted-living scenarios where <strong>privacy and efficiency</strong> are essential. Planned improvements to the edge device layer include:</p>
<ul>
<li><p>Replace the full operating system with a <strong>lightweight Linux distribution</strong> like Alpine Linux or Raspberry Pi OS Lite. This change will use fewer system resources, speed up boot times, and improve overall performance by cutting down on unnecessary background processes and services.</p>
</li>
<li><p>Automate updating the model from the cloud using <strong>SSH and ONNX replacement</strong>. This will keep the device running the latest model version, improving prediction accuracy and reliability. By securely connecting to the cloud, updates can happen automatically without needing manual work, reducing downtime and maintenance efforts.</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Developing a privacy-focused Human Activity Recognition (HAR) system on resource-constrained edge devices is a promising way to improve privacy and efficiency in shared spaces. By combining low-resolution audio, IMU sensors, and RFID tags, this system aims to recognize activities accurately while keeping user privacy intact. The Raspberry Pi prototype shows that real-time HAR prediction is possible on low-power hardware. Although the current model's accuracy is moderate, ongoing improvements, such as adding more data modalities and refining preprocessing, should boost performance. Future work will focus on optimizing the system design, structuring datasets, and automating deployments to edge devices, making the system practical in real-world settings. This project is a step toward energy-efficient and privacy-aware HAR solutions for edge devices.</p>
<h2 id="heading-code-amp-resources">Code &amp; Resources</h2>
<ul>
<li><p>🔗 <a target="_blank" href="https://github.com/RodCaba/fp-audio-service">GitHub Repo: fp-audio-service</a> - Modern PyTorch accessor for the Kitchen20 dataset.</p>
</li>
<li><p>🔗 <a target="_blank" href="https://github.com/marc-moreaux/kitchen20">Kitchen20 Dataset</a> - Moreaux et al. (2019) original Kitchen20 implementation.</p>
</li>
<li><p>🧠 <a target="_blank" href="https://docs.google.com/document/d/e/2PACX-1vS-NGYzTKVmEbb3WYTKFKBOuVdx8XMZpv8Okhm3qRC49KtE6BTFuWHqIf13jEe0TMhJSbambIdoPFVH/pub">Literature Review and project motivations</a></p>
</li>
</ul>
<h2 id="heading-follow-the-series">Follow the Series</h2>
<p>This post is part of a series documenting my dissertation project on <strong>Energy-Efficient and Privacy-Aware Multimodal HAR for Edge Devices</strong>. Future posts will cover:</p>
<ol>
<li><p><strong>System Design &amp; Architecture</strong>: Tiers, sensors, and design principles</p>
</li>
<li><p><strong>Dataset Structuring</strong>: From Kitchen20 to custom multi-resident datasets</p>
</li>
<li><p><strong>Edge Device Deployment</strong>: Optimization, updates, and latency analysis</p>
</li>
<li><p><strong>Final Evaluation &amp; Results</strong>: Precision, recall, and real-world usability</p>
</li>
</ol>
<p>Subscribe to <a target="_blank" href="https://typo.hashnode.dev">typo.hashnode.dev</a> and follow along!</p>
]]></content:encoded></item><item><title><![CDATA[Implementing a MaxHeap in C++]]></title><description><![CDATA[Introduction
In my first article, I want to explore a Computer Science topic: the heap data structure.
These kinds of theoretical topics are pretty common in technical interviews, and although I have never used them in my day-to-day job, the applicat...]]></description><link>https://blog.rodrigocaballero.net/implementing-a-maxheap-in-c</link><guid isPermaLink="true">https://blog.rodrigocaballero.net/implementing-a-maxheap-in-c</guid><category><![CDATA[C++]]></category><category><![CDATA[General Programming]]></category><category><![CDATA[Computer Science]]></category><category><![CDATA[algorithms]]></category><category><![CDATA[data structures]]></category><dc:creator><![CDATA[Rodrigo Caballero]]></dc:creator><pubDate>Tue, 05 Sep 2023 20:42:57 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1693945693133/f7522cfc-0e9f-49ef-a387-108509ecc41d.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-introduction">Introduction</h3>
<p>In my first article, I want to explore a Computer Science topic: the heap data structure.</p>
<p>These kinds of theoretical topics are pretty common in technical interviews, and although I have never used them in my day-to-day job, the applications of these data structures are exciting and worth studying to improve your problem-solving skills.</p>
<p>In this article, I will explain what a heap is, its uses and how to implement it in C++.</p>
<h3 id="heading-what-is-a-heap">What is a heap?</h3>
<p>A heap is a tree data structure that satisfies two properties:</p>
<ul>
<li><p>It is a complete binary tree</p>
</li>
<li><p>The value of a node is greater than or equal to (max heap) or less than or equal to (min-heap) the value of its children</p>
</li>
</ul>
<p>An example of a max heap would be the following:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1693884018078/38142f52-58e7-444d-a397-69ed78217bc9.png" alt class="image--center mx-auto" /></p>
<p>The example above is both:</p>
<ul>
<li><p>A complete binary tree:</p>
<ul>
<li><p>All levels are filled except possibly the last level.</p>
</li>
<li><p>The last level has all keys as left as possible.</p>
</li>
</ul>
</li>
<li><p>A max heap:</p>
<ul>
<li>The value of a node is greater than or equal to the value of its children.</li>
</ul>
</li>
</ul>
<h3 id="heading-uses-of-a-heap">Uses of a heap</h3>
<p>A useful property of the heap is that the root node is always the tree's maximum (max heap) or minimum (min-heap) value.</p>
<p>This property becomes handy to implement a priority queue, where the root node is always the next element to be processed.</p>
<p>Priority queues are useful in many applications, for example:</p>
<ul>
<li>A CPU scheduler, where the tasks with the highest priority are processed first.</li>
<li>A web server, where the requests with the highest priority are processed first.</li>
</ul>
<p>Another use of the heap is to implement a sorting algorithm. The idea is to insert all the elements in the heap and then extract them individually. The result will be a sorted ascending list (min-heap) or descending list (max heap).</p>
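<p>Although this article builds the heap in C++, the sorting idea is quick to demonstrate with Python's standard-library <code>heapq</code> (a min-heap): push every element, then pop repeatedly to get an ascending order.</p>

```python
import heapq

def heapsort(items):
    heap = []
    for x in items:
        heapq.heappush(heap, x)  # O(log n) per insertion
    # Each pop returns the current minimum, yielding ascending order.
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heapsort([5, 1, 4, 2, 3]))  # [1, 2, 3, 4, 5]
```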
<h3 id="heading-heap-representation">Heap Representation</h3>
<p>The most common way to represent a heap is through an array.</p>
<p>Thanks to the properties of a complete binary tree, the root node is the first element of the array, and its children are the second and third elements. Then, the children of the second element are the fourth and fifth elements, and so on.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1693884139028/d6b71d63-e498-4360-8acc-f11816a9c494.png" alt class="image--center mx-auto" /></p>
<p>We get the children of a node with the following formulas, assuming that the array is 0-indexed:</p>
<p>$$left(i) = (2 * i) + 1$$</p><p>$$right(i) = (2 * i) + 2$$</p><p>To get the parent of a node we use the following formula:</p>
<p>$$parent(i) = FLOOR(\frac{i-1}{2})$$</p><h3 id="heading-implementation">Implementation</h3>
<p>We will assume a max heap implementation in this article. Let's imagine we are constructing a priority queue for an application that processes tasks. Each task has a priority and we want to process the tasks with the highest priority first.</p>
<p>This will be our Heap class structure:</p>
<pre><code class="lang-c++"><span class="hljs-comment">// MaxHeap.h</span>
<span class="hljs-meta">#inclde <span class="hljs-meta-string">&lt;string&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;vector&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cmath&gt;</span></span>

<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">Task</span> {</span>
    <span class="hljs-keyword">int</span> priority;
    <span class="hljs-built_in">std</span>::<span class="hljs-built_in">string</span> id;
};

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">MaxHeap</span> {</span>
  <span class="hljs-keyword">public</span>:
    MaxHeap(){
        heap = <span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;Task&gt;();
    }
    ~MaxHeap() {} <span class="hljs-comment">// defined inline; the vector cleans up after itself</span>
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">insert</span><span class="hljs-params">(Task task)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">isEmpty</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function">Task <span class="hljs-title">extractMax</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span> <span class="hljs-title">getLeft</span><span class="hljs-params">(<span class="hljs-keyword">int</span> i)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span> <span class="hljs-title">getRight</span><span class="hljs-params">(<span class="hljs-keyword">int</span> i)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span> <span class="hljs-title">getParent</span><span class="hljs-params">(<span class="hljs-keyword">int</span> i)</span></span>;
  <span class="hljs-keyword">private</span>:
    <span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;Task&gt; heap;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">swap</span><span class="hljs-params">(<span class="hljs-keyword">int</span> i, <span class="hljs-keyword">int</span> j)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> <span class="hljs-title">highestIndex</span><span class="hljs-params">(<span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> i)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">heapify</span><span class="hljs-params">(<span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> i)</span></span>;
};
</code></pre>
<p>In the above code, we have a Task struct containing the task's priority and id.</p>
<p>The <code>MaxHeap</code> class consists of the following:</p>
<ul>
<li><p>3 public methods that allow other classes to insert a task, extract the task with the highest priority and check if the heap is empty.</p>
</li>
<li><p>3 static methods that return the indices of a node's children and parent.</p>
<ul>
<li>These methods are static because they don't need to access any <em>state</em> of the Heap class, they just need the index of the node.</li>
</ul>
</li>
<li><p>A private vector of tasks will be used to store the tasks in the heap.</p>
<ul>
<li>This vector is initialized in the constructor of the class.</li>
</ul>
</li>
<li><p>3 private methods that the public methods use to insert and extract tasks from the heap.</p>
</li>
</ul>
<p>We also import the <code>vector</code>, <code>string</code> and <code>cmath</code> libraries. The <code>vector</code> library will be used to store the tasks in the heap, the <code>string</code> library to store the ID of the task and the <code>cmath</code> library to use the floor function.</p>
<p>Let's start the implementation of static methods to get the children and parent of a node, this will be pretty straightforward, as we just need to apply the formulas that we saw before:</p>
<pre><code class="lang-c++"><span class="hljs-comment">// MaxHeap.cpp</span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">"MaxHeap.h"</span></span>

<span class="hljs-function"><span class="hljs-keyword">int</span> <span class="hljs-title">MaxHeap::getLeft</span><span class="hljs-params">(<span class="hljs-keyword">int</span> i)</span> </span>{
    <span class="hljs-keyword">return</span> (<span class="hljs-number">2</span> * i) + <span class="hljs-number">1</span>;
}

<span class="hljs-function"><span class="hljs-keyword">int</span> <span class="hljs-title">MaxHeap::getRight</span><span class="hljs-params">(<span class="hljs-keyword">int</span> i)</span> </span>{
    <span class="hljs-keyword">return</span> (<span class="hljs-number">2</span> * i) + <span class="hljs-number">2</span>;
}

<span class="hljs-function"><span class="hljs-keyword">int</span> <span class="hljs-title">MaxHeap::getParent</span><span class="hljs-params">(<span class="hljs-keyword">int</span> i)</span> </span>{
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">floor</span>((i - <span class="hljs-number">1</span>) / <span class="hljs-number">2</span>);
}
</code></pre>
<p>Now, let's take a look at the private functions that will be used for the insertion and extraction of the tasks:</p>
<pre><code class="lang-c++"><span class="hljs-comment">// MaxHeap.cpp</span>

<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">MaxHeap::swap</span><span class="hljs-params">(<span class="hljs-keyword">int</span> i, <span class="hljs-keyword">int</span> j)</span> </span>{
    Task temp = heap[i];
    heap[i] = heap[j];
    heap[j] = temp;
}

<span class="hljs-function"><span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> <span class="hljs-title">MaxHeap::highestIndex</span><span class="hljs-params">(<span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> i)</span> </span>{
    <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> highest = i;
    <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> left = MinHeap::getLeft(i);

    <span class="hljs-comment">// No left child, hence no right child, return parent</span>
    <span class="hljs-keyword">if</span> (left &gt;= heap.size()) {
        <span class="hljs-keyword">return</span> highest;
    }

    <span class="hljs-keyword">if</span> (heap[left].priority &gt; heap[highest].priority) {
        highest = left;
    }

    <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> right = MinHeap::getRight(i);

    <span class="hljs-keyword">if</span> (right &lt; heap.size() &amp;&amp; heap[right].priority &gt; heap[highest].priority) {
        highest = right;
    }

    <span class="hljs-keyword">return</span> highest;
}

<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">MaxHeap::heapify</span><span class="hljs-params">(<span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> i)</span> </span>{
    <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> highest = highestIndex(i);
    <span class="hljs-keyword">if</span> (highest != i) {
        swap(i, highest);
        heapify(highest);
    }
}
</code></pre>
<p>Let's go through the code above:</p>
<p>The <code>swap</code> function swaps the position of two elements in the heap, given their indexes.</p>
<p>The <code>highestIndex</code> function returns the index of the highest priority element between the node with index i and its children. If the node with index i has no children, it returns the index of the node with index i.</p>
<p>Finally, the <code>heapify</code> function takes a node with an index <code>i</code> and swaps it with its highest-priority child if the child has a higher priority than the node. It then calls itself recursively with the index of the child, this will ensure that the heap property is satisfied from the node with index <code>i</code> all the way to the leaf nodes.</p>
<p>Believe it or not, that's as complicated as it gets. Now that we have the static methods and the private methods, we can implement the public methods that will be used to insert and extract tasks from the heap.</p>
<pre><code class="lang-c++"><span class="hljs-comment">// MaxHeap.cpp</span>

<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">MaxHeap::insert</span><span class="hljs-params">(Task task)</span></span>{
  <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">int</span> pos = heap.size();
  heap.push_back(task);
  <span class="hljs-keyword">while</span> (pos &gt; <span class="hljs-number">0</span> &amp;&amp; heap[MaxHeap::getParent(pos)].priority &lt; heap[pos].priority) {
    swap(pos, MaxHeap::getParent(pos));
    pos = MaxHeap::getParent(pos);
  }
}

<span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">MaxHeap::isEmpty</span><span class="hljs-params">()</span> </span>{
  <span class="hljs-keyword">return</span> heap.size() == <span class="hljs-number">0</span>;
}

<span class="hljs-function">Task <span class="hljs-title">MaxHeap::extractMax</span><span class="hljs-params">()</span> </span>{
  <span class="hljs-keyword">if</span> (isEmpty()) {
    <span class="hljs-keyword">throw</span> <span class="hljs-string">"Heap is empty"</span>;
  }
  Task max{ heap[<span class="hljs-number">0</span>].priority, heap[<span class="hljs-number">0</span>].id };
  <span class="hljs-comment">// Move the last element to the root and shrink the vector; erasing the</span>
  <span class="hljs-comment">// front would shift every element and can leave the heap property broken</span>
  heap[<span class="hljs-number">0</span>] = heap.back();
  heap.pop_back();

  <span class="hljs-keyword">if</span> (isEmpty()) <span class="hljs-keyword">return</span> max; 

  heapify(<span class="hljs-number">0</span>);
  <span class="hljs-keyword">return</span> max;
}
</code></pre>
<p>The <code>insert</code> method takes a task and inserts it in the heap. It first inserts the task at the end of the heap and then swaps it with its parent until the heap property is satisfied.</p>
<p>The <code>isEmpty</code> method returns true if the heap is empty, and false otherwise.</p>
<p>The <code>extractMax</code> method returns the task with the highest priority and removes it from the heap. It first checks whether the heap is empty and, if so, throws an exception. It then saves the root task, removes it from the heap, and (if any elements remain) calls <code>heapify</code> on the root node to restore the heap property.</p>
<p>Now that we have implemented the heap, we can use it in our application:</p>
<pre><code class="lang-c++"><span class="hljs-comment">// main.cpp</span>

<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;iostream&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">"MaxHeap.h"</span></span>

<span class="hljs-function"><span class="hljs-keyword">int</span> <span class="hljs-title">main</span><span class="hljs-params">()</span> </span>{
    MaxHeap heap;
    heap.insert(Task{ <span class="hljs-number">1</span>, <span class="hljs-string">"Task 1"</span> });
    heap.insert(Task{ <span class="hljs-number">2</span>, <span class="hljs-string">"Task 2"</span> });
    heap.insert(Task{ <span class="hljs-number">3</span>, <span class="hljs-string">"Task 3"</span> });

    <span class="hljs-keyword">while</span> (!heap.isEmpty()) {
        Task task = heap.extractMax();
        <span class="hljs-built_in">std</span>::<span class="hljs-built_in">cout</span> &lt;&lt; <span class="hljs-string">"Task with id "</span> &lt;&lt; task.id &lt;&lt; <span class="hljs-string">" and priority "</span> &lt;&lt; task.priority &lt;&lt; <span class="hljs-string">" extracted"</span> &lt;&lt; <span class="hljs-built_in">std</span>::<span class="hljs-built_in">endl</span>;
    }

    <span class="hljs-comment">// Output:</span>
    <span class="hljs-comment">// Task with id Task 3 and priority 3 extracted</span>
    <span class="hljs-comment">// Task with id Task 2 and priority 2 extracted</span>
    <span class="hljs-comment">// Task with id Task 1 and priority 1 extracted</span>

    <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
}
</code></pre>
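<p>It is worth noting that the standard library already ships a heap: <code>std::priority_queue</code>, whose <code>push</code> and <code>pop</code> are O(log n), just like <code>insert</code> and <code>extractMax</code> above. A minimal sketch of the same example, assuming a <code>Task</code> struct with the same <code>priority</code> and <code>id</code> fields:</p>
<pre><code class="lang-c++">// priority_queue_sketch.cpp

#include &lt;iostream&gt;
#include &lt;queue&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

struct Task {
    int priority;
    std::string id;
};

// Comparator: lower-priority tasks sink, so the highest priority stays on top.
struct ByPriority {
    bool operator()(const Task&amp; a, const Task&amp; b) const {
        return a.priority &lt; b.priority;
    }
};

int main() {
    std::priority_queue&lt;Task, std::vector&lt;Task&gt;, ByPriority&gt; heap;
    heap.push(Task{ 1, "Task 1" });
    heap.push(Task{ 2, "Task 2" });
    heap.push(Task{ 3, "Task 3" });

    while (!heap.empty()) {
        Task task = heap.top();
        heap.pop();
        std::cout &lt;&lt; "Task with id " &lt;&lt; task.id &lt;&lt; " and priority " &lt;&lt; task.priority &lt;&lt; " extracted" &lt;&lt; std::endl;
    }

    return 0;
}
</code></pre>
<p>The output is identical to the hand-rolled version. Implementing the heap yourself is still a great exercise, but in production code the standard container saves you from bugs like the one we just avoided in <code>extractMax</code>.</p>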
<h3 id="heading-conclusion">Conclusion</h3>
<p>In this article, we have seen what a heap is, what it is used for, and how to implement a max heap in C++.</p>
<p>As I stated, topics like these sharpen your problem-solving skills: they push you to find simple solutions to complex problems, and they are a great way to practice your programming.</p>
<p>I hope you enjoyed this article, and if you have any questions or suggestions, please let me know in the comments below.</p>
]]></content:encoded></item></channel></rss>