<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Justin Miller</title><link>https://justinrmiller.github.io/</link><description>Recent content on Justin Miller</description><generator>Hugo -- gohugo.io</generator><language>en-US</language><managingEditor>justinrmiller@gmail.com (Justin Miller)</managingEditor><webMaster>justinrmiller@gmail.com (Justin Miller)</webMaster><lastBuildDate>Sun, 12 Apr 2026 12:00:00 -0700</lastBuildDate><atom:link href="https://justinrmiller.github.io/index.xml" rel="self" type="application/rss+xml"/><item><title>OpenSearch vs LanceDB for Vector Search: Performance and Cost</title><link>https://justinrmiller.github.io/opensearch-vs-lancedb-for-vector-search-performance-and-cost/</link><pubDate>Sun, 12 Apr 2026 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/opensearch-vs-lancedb-for-vector-search-performance-and-cost/</guid><description>&lt;p&gt;Choosing a vector database usually comes down to a tradeoff between a full search service and an in-process library. OpenSearch and LanceDB sit on opposite ends of that spectrum: one runs as a distributed cluster, the other as an embedded file format you query directly from your application. 
This post benchmarks both on the same workload (287,360 COCO 2017 images embedded with SigLIP 2), measuring ingestion throughput, query latency, storage layout, and AWS cost.&lt;/p&gt;</description></item><item><title>Building an E2E Encrypted Chat Application with LanceDB and libsodium</title><link>https://justinrmiller.github.io/building-an-e2e-encrypted-chat-application-with-lancedb-and-libsodium/</link><pubDate>Sat, 28 Mar 2026 00:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/building-an-e2e-encrypted-chat-application-with-lancedb-and-libsodium/</guid><description>&lt;p&gt;Building a chat application where the server never sees plaintext messages requires careful coordination between cryptographic primitives, real-time delivery, and persistent storage. This post examines Seal, an end-to-end encrypted chat application that pairs LanceDB for zero-infrastructure storage with libsodium for audited, high-performance cryptography. The goal is to make Seal as easy as possible to deploy and as inexpensive as possible to run by building on object storage.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://justinrmiller.github.io/assets/seal-screenshot.png" alt="Seal Screenshot"&gt;&lt;/p&gt;
&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;End-to-end encrypted messaging systems face a fundamental tension: the server must store and relay messages it cannot read. This creates challenges that don&amp;rsquo;t exist in traditional chat applications:&lt;/p&gt;</description></item><item><title>Building a Ray Data Integration for Apache Kafka with confluent-kafka</title><link>https://justinrmiller.github.io/building-a-ray-data-integration-for-apache-kafka-with-confluent-kafka/</link><pubDate>Sat, 20 Dec 2025 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/building-a-ray-data-integration-for-apache-kafka-with-confluent-kafka/</guid><description>&lt;p&gt;Streaming data to Kafka from distributed data processing pipelines is a common pattern, but the integration details matter. This post examines a Ray Data sink for Kafka built on confluent-kafka, leveraging the high-performance librdkafka library for efficient distributed writes.&lt;/p&gt;
&lt;h2 id="the-problem"&gt;The Problem&lt;/h2&gt;
&lt;p&gt;Writing Ray Datasets to Kafka presents several challenges that don&amp;rsquo;t align cleanly with typical batch processing patterns:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Converting DataFrames to Kafka message formats (key-value pairs)&lt;/li&gt;
&lt;li&gt;Choosing appropriate serialization strategies (JSON, strings, bytes)&lt;/li&gt;
&lt;li&gt;Managing asynchronous delivery and callbacks&lt;/li&gt;
&lt;li&gt;Handling producer buffer overflow in high-throughput scenarios&lt;/li&gt;
&lt;li&gt;Batching efficiently without blocking on network I/O&lt;/li&gt;
&lt;li&gt;Managing message keys for partitioning and compaction&lt;/li&gt;
&lt;/ul&gt;
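The challenges above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the post's actual sink code: it assumes a confluent-kafka `Producer`, and the helper names (`encode_batch`, `produce_with_backpressure`) are hypothetical. It shows JSON key/value serialization plus retry-on-`BufferError` backpressure handling so a full local queue blocks instead of silently dropping messages.

```python
import json


def encode_batch(rows, key_field="id"):
    """Convert a batch of records into Kafka (key, value) byte pairs.

    Keys drive partitioning and log compaction; values are JSON-encoded.
    """
    messages = []
    for row in rows:
        key = str(row[key_field]).encode("utf-8") if key_field in row else None
        value = json.dumps(row).encode("utf-8")
        messages.append((key, value))
    return messages


def produce_with_backpressure(producer, topic, messages):
    """Produce asynchronously, servicing delivery callbacks and retrying
    when the local buffer fills (BufferError) instead of losing data."""
    def on_delivery(err, msg):
        if err is not None:
            raise RuntimeError(f"delivery failed: {err}")

    for key, value in messages:
        while True:
            try:
                producer.produce(topic, key=key, value=value,
                                 on_delivery=on_delivery)
                break
            except BufferError:
                producer.poll(0.1)  # drain the queue, fire callbacks, retry

    producer.flush()  # block until all outstanding messages are delivered
```

The `poll`/`flush` calls are what keep batching efficient without blocking on every send: librdkafka buffers and batches in the background, and the loop only waits when the buffer is actually full.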
&lt;p&gt;Without careful design, you end up with inefficient writes, memory issues, or silent data loss.&lt;/p&gt;</description></item><item><title>Building Ray Data Integrations for Vector Databases</title><link>https://justinrmiller.github.io/building-ray-data-integrations-for-vector-databases/</link><pubDate>Tue, 08 Jul 2025 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/building-ray-data-integrations-for-vector-databases/</guid><description>&lt;p&gt;Vector databases like Turbopuffer are becoming essential for AI applications, but integrating them can be a challenge without careful attention to batching and performance. This post examines an integration that handles the details of distributed vector operations.&lt;/p&gt;
&lt;h2 id="the-problem"&gt;The Problem&lt;/h2&gt;
&lt;p&gt;Vector databases need specific data formats and batch strategies that may not align well with DataFrame operations. You end up dealing with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Converting between DataFrames and vector formats&lt;/li&gt;
&lt;li&gt;Optimizing batch sizes for network efficiency&lt;/li&gt;
&lt;li&gt;Handling failures in distributed environments&lt;/li&gt;
&lt;li&gt;Managing vector arrays and metadata together&lt;/li&gt;
&lt;/ul&gt;
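To make the batching point concrete, here is one hedged sketch, not the integration's actual API (the function and column names are illustrative), of turning a columnar block into bounded-size upsert batches in which each row keeps its vector and metadata together:

```python
from typing import Any, Dict, Iterator, List


def to_upsert_batches(
    block: Dict[str, List[Any]],
    vector_column: str = "vector",
    id_column: str = "id",
    batch_size: int = 1000,
) -> Iterator[List[Dict[str, Any]]]:
    """Convert a columnar block into vector-DB upsert rows, chunked so each
    network request stays a bounded size."""
    ids = block[id_column]
    vectors = block[vector_column]
    meta_cols = {c: v for c, v in block.items()
                 if c not in (id_column, vector_column)}
    # One row per record: id + vector + all metadata columns together.
    rows = [
        {"id": ids[i], "vector": list(vectors[i]),
         **{c: vals[i] for c, vals in meta_cols.items()}}
        for i in range(len(ids))
    ]
    # Chunk into fixed-size batches for network efficiency.
    for start in range(0, len(rows), batch_size):
        yield rows[start : start + batch_size]
```

Tuning `batch_size` trades per-request overhead against request payload size and retry cost, which is the core of the batching problem listed above.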
&lt;h2 id="architecture-datasource-and-datasink"&gt;Architecture: Datasource and Datasink&lt;/h2&gt;
&lt;p&gt;The Ray-Turbopuffer integration implements custom &lt;code&gt;TurbopufferDatasource&lt;/code&gt; and &lt;code&gt;TurbopufferDatasink&lt;/code&gt; classes that handle bidirectional data flow:&lt;/p&gt;</description></item><item><title>Interfacing Ray with internal libraries and services using Pydantic</title><link>https://justinrmiller.github.io/interfacing-ray-with-internal-libraries-and-services-using-pydantic/</link><pubDate>Wed, 11 Jun 2025 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/interfacing-ray-with-internal-libraries-and-services-using-pydantic/</guid><description>&lt;p&gt;Companies often develop internal libraries that expose function inputs and outputs via Pydantic models. Whether your internal library is for machine learning, API clients, or data validation services, you&amp;rsquo;ll often need to convert large DataFrames into Pydantic objects and pass batches of them to these libraries.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s how to do it efficiently using Ray Data.&lt;/p&gt;
&lt;h2 id="the-challenge"&gt;The Challenge&lt;/h2&gt;
&lt;p&gt;Many in-house Python libraries/services follow these patterns:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;ml_service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict_batch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;PredictionRequest&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;PredictionResponse&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;validator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;validate_batch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;DataModel&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;DataValidation&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;api_client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;submit_batch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;APIRecord&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;APIResponses&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;But your data starts as DataFrames. This approach doesn&amp;rsquo;t scale if there&amp;rsquo;s any latency in the sequential processing of a batch:&lt;/p&gt;</description></item><item><title>Generating dbt Documentation with OpenAI and GitHub Actions</title><link>https://justinrmiller.github.io/generating-dbt-documentation-with-openai-and-github-actions/</link><pubDate>Mon, 21 Apr 2025 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/generating-dbt-documentation-with-openai-and-github-actions/</guid><description>&lt;p&gt;In this blog post, I’ll walk through how to &lt;strong&gt;automatically generate documentation for your dbt project using OpenAI&lt;/strong&gt;, fully integrated into your GitHub Actions pipeline. This process will analyze every dbt model and seed file and generate detailed dbt documentation in Markdown with OpenAI&amp;rsquo;s GPT-4o model, committed back to the pull request.&lt;/p&gt;
&lt;h3 id="technology-stack"&gt;Technology Stack&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;dbt:&lt;/strong&gt; The analytics engineering tool for transforming data in your warehouse.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenAI GPT-4o (via API):&lt;/strong&gt; Used to analyze dbt models and generate Markdown documentation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GitHub Actions:&lt;/strong&gt; Automates the entire flow on every pull request.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Typer CLI:&lt;/strong&gt; Simplifies local development of the script used to collect and submit prompt data.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="advantages-of-this-approach"&gt;Advantages of This Approach&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Automated Metadata Generation:&lt;/strong&gt; Removes the burden of writing and maintaining technical documentation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Contextual, Business-Friendly Summaries:&lt;/strong&gt; The model extracts intent, joins, lineage, and structure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Seamless GitHub Integration:&lt;/strong&gt; Keeps docs in sync by auto-committing the output to the PR branch.&lt;/li&gt;
&lt;/ul&gt;
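As a rough sketch of the prompt-assembly step (the function name and prompt wording are hypothetical, not the post's exact script), each model's SQL can be packaged into a chat request for the OpenAI API:

```python
from pathlib import Path
from typing import Dict, List


def build_doc_prompt(model_path: Path, sql: str) -> List[Dict[str, str]]:
    """Assemble chat messages asking an LLM to document one dbt model.

    The returned list is the standard `messages` payload for a
    chat-completions call.
    """
    system = (
        "You are an analytics engineer. Given a dbt model's SQL, write "
        "Markdown documentation covering its purpose, upstream sources and "
        "joins, and a table describing each output column."
    )
    # The model name comes from the file stem, matching dbt conventions.
    user = f"dbt model: {model_path.stem}\n\nSQL:\n{sql}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

In the workflow, a CLI script would iterate over `models/` and `seeds/`, call the API once per file with messages like these, and write the Markdown output back to the branch.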
&lt;h3 id="prerequisites"&gt;Prerequisites&lt;/h3&gt;
&lt;p&gt;Before running the workflow, make sure the following are in place:&lt;/p&gt;</description></item><item><title>LLM Inference with the Airflow AI SDK and Ollama</title><link>https://justinrmiller.github.io/llm-inference-with-the-airflow-ai-sdk-and-ollama/</link><pubDate>Sun, 06 Apr 2025 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/llm-inference-with-the-airflow-ai-sdk-and-ollama/</guid><description>&lt;p&gt;In this blog post, I will demonstrate how the &lt;strong&gt;Airflow AI SDK&lt;/strong&gt; and &lt;strong&gt;Ollama&lt;/strong&gt; can be used to develop Airflow DAGs locally at no cost with the Astro CLI. This allows for full end-to-end testing of the Airflow DAG without reaching out to a third-party LLM provider, and for a broader range of model selection. The example takes a collection of product names, submits a request to Ollama with a system prompt designed to generate blog ideas, then prints out the blog ideas.&lt;/p&gt;</description></item><item><title>Fine Tuning Gemma 3 with Unsloth</title><link>https://justinrmiller.github.io/fine-tuning-gemma-3-with-unsloth/</link><pubDate>Sun, 23 Mar 2025 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/fine-tuning-gemma-3-with-unsloth/</guid><description>&lt;p&gt;In this post, I&amp;rsquo;ll walk through fine-tuning Gemma 3 using Hugging Face and the Unsloth library. This article is based on an Unsloth Colab notebook, and the consolidated source code is available here: &lt;a href="https://github.com/justinrmiller/llm-fine-tuning/blob/main/gemma3/training/__main__.py"&gt;Gemma 3 Training with Unsloth&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;To fully use the provided source code (for model export to HF), you must be logged in to Hugging Face via &lt;code&gt;huggingface-cli&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id="technologies-used"&gt;Technologies Used&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Hugging Face:&lt;/strong&gt; Platform for hosting and training LLMs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Unsloth:&lt;/strong&gt; Library optimized for fast, efficient fine-tuning.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TRL (Transformer Reinforcement Learning):&lt;/strong&gt; Provides SFTTrainer for supervised fine-tuning.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PyTorch:&lt;/strong&gt; Core framework for model computations.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="steps-involved"&gt;Steps Involved&lt;/h3&gt;
&lt;p&gt;The fine-tuning process involves several key steps:&lt;/p&gt;</description></item><item><title>Automating Ray Workflows with Astronomer and Ray on Vertex AI</title><link>https://justinrmiller.github.io/automating-ray-workflows-with-astronomer-and-ray-on-vertex-ai/</link><pubDate>Sun, 08 Dec 2024 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/automating-ray-workflows-with-astronomer-and-ray-on-vertex-ai/</guid><description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Orchestrating distributed workloads efficiently is essential to maintaining the performance and cost-effectiveness of modern data engineering and machine learning workflows. By combining &lt;a href="https://www.astronomer.io/"&gt;Astronomer&lt;/a&gt;, &lt;a href="https://www.ray.io/"&gt;Ray&lt;/a&gt;, and &lt;a href="https://cloud.google.com/vertex-ai"&gt;Google Cloud Vertex AI&lt;/a&gt;, you can dynamically create and manage Ray clusters to execute jobs and clean up resources afterward. In this post, we’ll work through a process to automate these steps.&lt;/p&gt;
&lt;h2 id="why-astronomer-ray-and-vertex-ai"&gt;&lt;strong&gt;Why Astronomer, Ray, and Vertex AI?&lt;/strong&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Astronomer&lt;/strong&gt; provides a managed Apache Airflow platform to orchestrate complex workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ray&lt;/strong&gt; enables distributed computing for machine learning, deep learning, and data-intensive applications.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vertex AI&lt;/strong&gt; offers managed services for scalable and cost-effective resource management on Google Cloud.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="benefits-of-using-astronomer-with-ray-on-vertex-ai"&gt;&lt;strong&gt;Benefits of Using Astronomer with Ray on Vertex AI&lt;/strong&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dynamic Ray Cluster Configuration&lt;/strong&gt;: Spin up Ray clusters that are sized to the needs of the Ray job with data retrieved from sources like Snowflake or Redshift.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost Control&lt;/strong&gt;: Spin down resources whenever the job completes to save on cluster resource costs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Simplified Local Development&lt;/strong&gt;: Easily set up and test workflows locally before deploying to production.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Effortless Deployment&lt;/strong&gt;: Astronomer CLI provides one command to push updates to the Astronomer platform.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Version Control&lt;/strong&gt;: Seamlessly integrates with Git for managing your DAGs and dependencies.&lt;/li&gt;
&lt;/ul&gt;
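The dynamic-sizing benefit can be illustrated with a small heuristic that a DAG task might run before creating the cluster; the constants and function name here are hypothetical, not from the post:

```python
import math


def size_ray_cluster(row_count: int, rows_per_worker: int = 5_000_000,
                     min_workers: int = 1, max_workers: int = 20) -> int:
    """Choose a Ray worker count proportional to the input size reported by
    a warehouse query (e.g. Snowflake or Redshift), clamped to a safe range
    so a bad row count can't request a runaway cluster."""
    needed = math.ceil(row_count / rows_per_worker)
    return max(min_workers, min(max_workers, needed))
```

An upstream Airflow task would query the warehouse for the row count, pass the result through a function like this, and feed the worker count into the Vertex AI cluster-creation call; a downstream task tears the cluster down when the Ray job finishes.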
&lt;h2 id="pre-requisites"&gt;&lt;strong&gt;Pre-requisites&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Before you begin, ensure the following:&lt;/p&gt;</description></item><item><title>Using LLMs and Vector Stores to search the Berkshire Hathaway Letters</title><link>https://justinrmiller.github.io/using-llms-and-vector-stores-to-search-the-berkshire-hathaway-letters/</link><pubDate>Mon, 14 Aug 2023 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/using-llms-and-vector-stores-to-search-the-berkshire-hathaway-letters/</guid><description>&lt;p&gt;Extracting information from text can be a time-consuming task, especially when you&amp;rsquo;re looking for a needle in a haystack. Warren Buffett has had a long and successful (perhaps the most successful?) career in investing and provides a number of anecdotes and words of wisdom in the form of the Berkshire Hathaway shareholder letters. In this blog post, I will lay out a method of extracting this wisdom via embeddings, a vector store, a large language model, and a search interface built with Streamlit.&lt;/p&gt;</description></item><item><title>CockroachDB Introduction</title><link>https://justinrmiller.github.io/cockroachdb-introduction/</link><pubDate>Thu, 08 Jun 2017 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/cockroachdb-introduction/</guid><description>&lt;p&gt;This blog post is a combination of notes and references used in my talk on June 8th, 2017 at ProtectWise.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;SQL&lt;/li&gt;
&lt;li&gt;ACID transactions but also scalable, consistent, and HA&lt;/li&gt;
&lt;li&gt;Raft consensus&lt;/li&gt;
&lt;li&gt;Features both linearizability and serializability, but without the atomic clocks that Spanner relies on&lt;/li&gt;
&lt;li&gt;Currently supported languages:
&lt;ul&gt;
&lt;li&gt;C++&lt;/li&gt;
&lt;li&gt;Clojure&lt;/li&gt;
&lt;li&gt;Go&lt;/li&gt;
&lt;li&gt;Java&lt;/li&gt;
&lt;li&gt;Node.js&lt;/li&gt;
&lt;li&gt;PHP&lt;/li&gt;
&lt;li&gt;Python&lt;/li&gt;
&lt;li&gt;Ruby&lt;/li&gt;
&lt;li&gt;Rust&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://jepsen.io/analyses/cockroachdb-beta-20160829"&gt;Aphyr&amp;rsquo;s Review&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="definitions"&gt;Definitions&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Linearizability - guarantees that operations appear to take effect instantaneously at some point between their start and completion, producing a single total order of events consistent with real time&lt;/li&gt;
&lt;li&gt;Serializability - guarantees that the constituent reads and writes within a transaction occur as though that transaction were given exclusive access to the database for the length of its execution, guaranteeing that no transactions interfere with each other&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="further-reading"&gt;Further reading&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://www.cockroachlabs.com/blog/living-without-atomic-clocks/"&gt;Cockroach Labs - Living without Atomic Clocks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cockroachlabs.com/blog/better-sql-joins-in-cockroachdb/"&gt;On the Way to Better SQL Joins in CockroachDB&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cockroachlabs.com/docs/known-limitations.html"&gt;Known Limitations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/cockroachdb/cockroach/blob/master/docs/design.md"&gt;CockroachDB Design Document&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf"&gt;Google Spanner Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Source code for Bank demo here: &lt;a href="https://justinrmiller.github.io/assets/populate_bank.py"&gt;populate_bank.py&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Scala Collections Series, Part 3 - Operations on Scala Collections, File I/O and Trivia</title><link>https://justinrmiller.github.io/scala-collections-series-part-3-operations-on-scala-collections-file-i/o-and-trivia/</link><pubDate>Sun, 28 May 2017 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/scala-collections-series-part-3-operations-on-scala-collections-file-i/o-and-trivia/</guid><description>&lt;p&gt;This blog post is the last of a three part series, this section covers:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Operations on Scala Collections&lt;/li&gt;
&lt;li&gt;File I/O&lt;/li&gt;
&lt;li&gt;Trivia&lt;/li&gt;
&lt;/ol&gt;
&lt;h4 id="operations-on-scala-collections"&gt;Operations on Scala Collections&lt;/h4&gt;
&lt;p&gt;It&amp;rsquo;s possible to compose operations on collections; here are a few examples.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-scala" data-lang="scala"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scala&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="n"&gt;listOfSentences&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;&amp;#39;Tis better to have loved and lost than never to have loved at all. #lovedthanlost&amp;#34;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;It ain&amp;#39;t what you don&amp;#39;t know that gets you into trouble, it&amp;#39;s what you know for sure that just ain&amp;#39;t so. #marktwain&amp;#34;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;#moon #onesmallstepforman That&amp;#39;s one small step for man, one giant leap for mankind.&amp;#34;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listOfSentences&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;List&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&amp;#39;Tis &lt;span class="n"&gt;better&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;have&lt;/span&gt; &lt;span class="n"&gt;loved&lt;/span&gt; &lt;span class="n"&gt;and&lt;/span&gt; &lt;span class="n"&gt;lost&lt;/span&gt; &lt;span class="n"&gt;than&lt;/span&gt; &lt;span class="n"&gt;never&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;have&lt;/span&gt; &lt;span class="n"&gt;loved&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt; &lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;lovedthanlost&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;It&lt;/span&gt; &lt;span class="n"&gt;ain&lt;/span&gt;&amp;#39;t &lt;span class="n"&gt;what&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;don&lt;/span&gt;&amp;#39;t &lt;span class="n"&gt;know&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="n"&gt;gets&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;into&lt;/span&gt; &lt;span class="n"&gt;trouble&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&amp;#39;s &lt;span class="n"&gt;what&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;know&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sure&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="n"&gt;just&lt;/span&gt; &lt;span class="n"&gt;ain&lt;/span&gt;&amp;#39;t &lt;span 
class="n"&gt;so&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;marktwain&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;moon&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;onesmallstepforman&lt;/span&gt; &lt;span class="nc"&gt;That&lt;/span&gt;&amp;#39;s &lt;span class="n"&gt;one&lt;/span&gt; &lt;span class="n"&gt;small&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;man&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;one&lt;/span&gt; &lt;span class="n"&gt;giant&lt;/span&gt; &lt;span class="n"&gt;leap&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;mankind&lt;/span&gt;&lt;span class="o"&gt;.)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scala&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="n"&gt;numberOfWords&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="n"&gt;listOfSentences&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flatMap&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34; &amp;#34;&lt;/span&gt;&lt;span class="o"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;numberOfWords&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;Int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scala&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="n"&gt;distinctWords&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="n"&gt;listOfSentences&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;!&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="n"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;))).&lt;/span&gt;&lt;span class="n"&gt;flatMap&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34; &amp;#34;&lt;/span&gt;&lt;span class="o"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;toSet&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;distinctWords&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;scala.collection.immutable.Set&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;it&lt;/span&gt;&amp;#39;s&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trouble&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;That&lt;/span&gt;&amp;#39;s&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;giant&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;have&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;lovedthanlost&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;than&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mankind&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sure&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;what&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;so&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;moon&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;just&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;marktwain&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ain&lt;/span&gt;&amp;#39;t&lt;span class="o"&gt;,&lt;/span&gt; &lt;span 
class="n"&gt;man&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;don&lt;/span&gt;&amp;#39;t&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lost&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;know&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;small&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &amp;#39;Tis&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;leap&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;onesmallstepforman&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loved&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gets&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;It&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;into&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;better&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;and&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;one&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;never&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scala&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;distinctWords&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;res102&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;Int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;38&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scala&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="n"&gt;hashtags&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="n"&gt;listOfSentences&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;!&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="n"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;))).&lt;/span&gt;&lt;span class="n"&gt;flatMap&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34; &amp;#34;&lt;/span&gt;&lt;span class="o"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;&amp;#39;#&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;hashtags&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;List&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;lovedthanlost&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;marktwain&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;moon&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;#&lt;/span&gt;&lt;span class="n"&gt;onesmallstepforman&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h4 id="file-io"&gt;File I/O&lt;/h4&gt;
&lt;p&gt;Input file:&lt;/p&gt;</description></item><item><title>Scala Collections Series, Part 2 - More Fundamental Structures</title><link>https://justinrmiller.github.io/scala-collections-series-part-2-more-fundamental-structures/</link><pubDate>Fri, 26 May 2017 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/scala-collections-series-part-2-more-fundamental-structures/</guid><description>&lt;p&gt;This blog post is the middle of a three-part series; this part covers:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Sets&lt;/li&gt;
&lt;li&gt;Maps (HashMap and TreeMap)&lt;/li&gt;
&lt;li&gt;Array and List Buffers&lt;/li&gt;
&lt;li&gt;Queues and Stacks&lt;/li&gt;
&lt;/ol&gt;
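As a quick sketch of the structures listed above (my own minimal examples, not the post's):

```scala
import scala.collection.mutable
import scala.collection.immutable.{HashMap, TreeMap}

// Sets: duplicates are collapsed
val s = Set(1, 2, 2, 3)                  // Set(1, 2, 3)

// Maps: HashMap has no defined iteration order; TreeMap keeps keys sorted
val hm = HashMap("b" -> 2, "a" -> 1)
val tm = TreeMap("b" -> 2, "a" -> 1)     // iterates "a" before "b"

// Buffers: mutable, growable sequences
val ab = mutable.ArrayBuffer(1, 2)
ab += 3                                  // ArrayBuffer(1, 2, 3)
val lb = mutable.ListBuffer("x")
lb += "y"                                // ListBuffer(x, y)

// Queue is FIFO; Stack is LIFO
val q = mutable.Queue(1, 2, 3)
val first = q.dequeue()                  // 1
val st = mutable.Stack(1, 2, 3)
val top = st.pop()
```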
&lt;p&gt;For performance characteristics of the various data structures listed below, please see: &lt;a href="http://docs.scala-lang.org/overviews/collections/performance-characteristics"&gt;Performance Characteristics&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="sets"&gt;Sets&lt;/h3&gt;
&lt;p&gt;Sets in Scala are relatively straightforward. There are a few interesting bits of trivia around Sets, which will be covered in the trivia section.&lt;/p&gt;
&lt;p&gt;For now, here are a number of examples of their use:&lt;/p&gt;</description></item><item><title>Scala Collections Series, Part 1 - Mutability and Lists</title><link>https://justinrmiller.github.io/scala-collections-series-part-1-mutability-and-lists/</link><pubDate>Wed, 24 May 2017 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/scala-collections-series-part-1-mutability-and-lists/</guid><description>&lt;p&gt;This blog post is the beginning of a three-part series covering a variety of topics around Scala Collections; this part will cover:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Mutable and Immutable Collections&lt;/li&gt;
&lt;li&gt;Lists&lt;/li&gt;
&lt;li&gt;Traversable and Iterable&lt;/li&gt;
&lt;li&gt;Sequence Traits (Seq, IndexedSeq, LinearSeq)&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="mutable-and-immutable-collections"&gt;Mutable and Immutable Collections&lt;/h2&gt;
&lt;p&gt;Collections packaged in the Scala Collections library can have mutable and immutable variants, generally separated by package name. This can make switching between the two very straightforward, but may require aliasing them if you use immutable and mutable data structures in the same file.&lt;/p&gt;</description></item><item><title>Scala Days 2016</title><link>https://justinrmiller.github.io/scala-days-2016/</link><pubDate>Mon, 20 Jun 2016 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/scala-days-2016/</guid><description>&lt;p&gt;This year I wasn&amp;rsquo;t able to make it to Scala Days (hoping next year I&amp;rsquo;ll get to go!). Quite a few of the talks seemed really interesting, though, and luckily the Scala Days talks were made available shortly after the conference. Here are my notes from a few of the videos (I&amp;rsquo;ll add more as I have time).&lt;/p&gt;
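Switching between mutable and immutable collections by qualifying the mutable package can be sketched like this (a minimal illustration, not from the post itself):

```scala
// Qualify mutable collections so the unqualified Set/Map names keep
// referring to the immutable defaults.
import scala.collection.mutable

val fixed = Set(1, 2, 3)           // immutable by default
val buf   = mutable.Set(1, 2, 3)   // explicitly mutable

buf += 4                           // updates buf in place
val grown = fixed + 4              // returns a new Set; fixed is unchanged
```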
&lt;p&gt;First, here&amp;rsquo;s a link to all the videos: &lt;a href="https://www.youtube.com/channel/UCOHg8YCiyMVRRxb3mJT_0Mg/videos"&gt;Scala Days NY 2016&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=JF-ttZyNa84"&gt;Martin Odersky, Keynote: Scala&amp;rsquo;s Road Ahead&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;Slide deck: &lt;a href="http://www.slideshare.net/Odersky/scala-days-nyc-2016"&gt;slides&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Scala Center (2:15)
&lt;ul&gt;
&lt;li&gt;a foundation supported by Lightbend, IBM, etc. with a focus on projects for the Scala community&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Scala 2.12 (3:00)
&lt;ul&gt;
&lt;li&gt;Targeted for mid-2016 release&lt;/li&gt;
&lt;li&gt;Uses Java 8 lambdas/default methods for better performance&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Programming in Scala v3 is out and it&amp;rsquo;s updated for 2.12&lt;/li&gt;
&lt;li&gt;Scala 2.13 (5:05)
&lt;ul&gt;
&lt;li&gt;Focus on the libraries&lt;/li&gt;
&lt;li&gt;Simpler libraries/in line Spark usage/lazy collections (views)&lt;/li&gt;
&lt;li&gt;Potentially splitting Scala standard library into core and platform&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Scala.js (10:00)
&lt;ul&gt;
&lt;li&gt;Version 0.6.9&lt;/li&gt;
&lt;li&gt;Anonymous classes/JUnit support/new site&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;DOT (11:00)
&lt;ul&gt;
&lt;li&gt;Proven to be a sound foundation for Scala&lt;/li&gt;
&lt;li&gt;A minimal language subset about which formal statements can be made and proven&lt;/li&gt;
&lt;li&gt;DOT Terms (13:00) - he covers the Scala notation covered by DOT (slide 17)&lt;/li&gt;
&lt;li&gt;Covers the soundness of the language&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;dotty (16:25)
&lt;ul&gt;
&lt;li&gt;Working name for new Scala compiler&lt;/li&gt;
&lt;li&gt;Builds on DOT in its internal data structures, but the language will still provide higher-level language features (such as generics)&lt;/li&gt;
&lt;li&gt;About twice the speed of nsc (the current Scala compiler), and should improve significantly in the future&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Dropped features (28:00)
&lt;ul&gt;
&lt;li&gt;Procedure Syntax (for instance: def someFun(x: String) { x.length })&lt;/li&gt;
&lt;li&gt;Macros&lt;/li&gt;
&lt;li&gt;Early initializers&lt;/li&gt;
&lt;li&gt;Existential Types&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;New features (32:30)
&lt;ul&gt;
&lt;li&gt;Intersection/Union Types&lt;/li&gt;
&lt;li&gt;Named type parameters&lt;/li&gt;
&lt;li&gt;Non-blocking lazy vals&lt;/li&gt;
&lt;li&gt;Trait parameters&lt;/li&gt;
&lt;li&gt;@static methods and fields&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Improvements in detail (39:30)&lt;/li&gt;
&lt;li&gt;Read this:
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.lihaoyi.com/post/StrategicScalaStylePrincipleofLeastPower.html"&gt;Strategic Scala Style: Principle of Least Power&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;h2 id="tim-soethout-implicits-inspected-and-explained"&gt;&lt;a href="https://www.youtube.com/watch?v=UHQbj-_9r8A"&gt;Tim Soethout, Implicits Inspected and Explained&lt;/a&gt;&lt;/h2&gt;
&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Category Theory for Programmers</title><link>https://justinrmiller.github.io/category-theory-for-programmers/</link><pubDate>Mon, 23 May 2016 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/category-theory-for-programmers/</guid><description>&lt;p&gt;Lately I&amp;rsquo;ve become interested in category theory and its application to software engineering. After watching a video titled &amp;ldquo;Programming isn&amp;rsquo;t Math&amp;rdquo; by Oscar Boykin, I decided to spend a bit of time watching videos on category theory and on libraries out of Twitter such as Algebird and Storehaus. Here&amp;rsquo;s a list of a few videos along with a brief summary of each.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=JF-ttZyNa84"&gt;Oscar Boykin, Programming Isn&amp;rsquo;t Math&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;generalizing a matrix library with monoids&lt;/li&gt;
&lt;li&gt;&lt;a href="https://twitter.github.io/algebird/api/#com.twitter.algebird.HyperLogLog$"&gt;HyperLogLog&lt;/a&gt; as a monoid
&lt;ul&gt;
&lt;li&gt;cardinality with ~1% error for large cardinalities w/storage of ~16 KB&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;at around 28:00 he mentions that int2BigEndian isn&amp;rsquo;t a true bijection; an interesting example of a coded bijection that isn&amp;rsquo;t truly one but is still useful&lt;/li&gt;
&lt;li&gt;mentioned &lt;a href="https://github.com/non/spire"&gt;Spire&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=wTE56cpttTk"&gt;Ian O&amp;rsquo;Connell, Algebird: SF Scala @Twitter&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Exporting and plotting GPS coordinates from cell phone photos</title><link>https://justinrmiller.github.io/exporting-and-plotting-gps-coordinates-from-cell-phone-photos/</link><pubDate>Tue, 22 Dec 2015 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/exporting-and-plotting-gps-coordinates-from-cell-phone-photos/</guid><description>&lt;p&gt;Recently I was interested in seeing all the places I&amp;rsquo;ve taken photos and found a way of extracting exif data (specifically lat/long) from images stored on my cell phone. Here are the steps I used.&lt;/p&gt;
&lt;h4 id="generate-the-csv-file"&gt;Generate the CSV file&lt;/h4&gt;
&lt;p&gt;To begin, install exiftool. This can be done on OS X by running &amp;ldquo;brew install exiftool&amp;rdquo; (assuming you have Homebrew installed).&lt;/p&gt;
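As a sketch of one reasonable invocation (my own guess at suitable flags, not necessarily the exact command from the post), exiftool can emit the CSV directly:

```shell
# -csv: CSV output (includes a SourceFile column automatically)
# -n:   numeric output, so GPS values come out as signed decimal degrees
# -r:   recurse into subdirectories; -ext jpg: only process .jpg files
exiftool -csv -n -r -ext jpg -gpslatitude -gpslongitude . > photogps.csv
```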
&lt;p&gt;The following will generate a CSV file (photogps.csv) from jpg images containing the name of the file, GPS lat, and GPS long.&lt;/p&gt;</description></item><item><title>FileVault 2 Performance Numbers</title><link>https://justinrmiller.github.io/filevault-2-performance-numbers/</link><pubDate>Thu, 04 Jun 2015 12:00:00 -0700</pubDate><author>justinrmiller@gmail.com (Justin Miller)</author><guid>https://justinrmiller.github.io/filevault-2-performance-numbers/</guid><description>&lt;p&gt;A quick post to demonstrate the differences in performance when FileVault 2 is enabled vs disabled.&lt;/p&gt;
&lt;p&gt;Today I re-installed OS X and ran various benchmarking tools (with and without FileVault 2 enabled). Here are the results:&lt;/p&gt;
&lt;p&gt;FileVault 2 Enabled:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://justinrmiller.github.io/assets/filevault2_enabled.jpg" alt="FileVault 2 Enabled Performance Numbers"&gt;&lt;/p&gt;
&lt;p&gt;FileVault 2 Disabled:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://justinrmiller.github.io/assets/filevault2_disabled.jpg" alt="FileVault 2 Disabled Performance Numbers"&gt;&lt;/p&gt;
&lt;p&gt;As we can see from the numbers above (I ran them repeatedly to ensure there wasn&amp;rsquo;t much of a difference between runs), the performance penalty is not nearly as pronounced as one might expect. As a result, I plan on full-disk encrypting both my laptop and my Time Machine backup.&lt;/p&gt;</description></item></channel></rss>