This is part three of a nine-post series on processing a log file with Elixir. If you find this article helpful, please subscribe and share 🚀 In the last post on processing a log file with Elixir, we split each line, whose fields are separated by spaces, into a list of items inside the outer list. Looking at the steps defined in the first post on how to process a log file with Elixir, we see that the next step is to filter the list items so that they contain only the URL and TCP_HIT/MISS.
- Fetch data from URL
- Split each new line into a list item
- Split each line into list items
- Filter items to only contain the URL and TCP_HIT/MISS
- Find the six-digit Video ID from the URL; it should be the first integer in HTTP paths of:
"example.com/04C0BF/v2/sources/content-owners/"
"example.com/04C0BF/ads/transcodes/"
- Group by Video ID
- Get Cache Hit and Misses for each Video
- Calculate the Cache Hit Misses
- Sort by Video ID
- Print to file
Our data is now looking something like this:
{:ok,
[
...
["1523756639", "3", "88.110.35.157", "2227424", "152.195.141.240", "80",
"TCP_HIT/200", "2227671", "GET",
"http://example.com/04C0BF/v2/sources/content-owners/cinedigm-itub/398629/v201711170053-2061k.mp4+4582327.ts",
"-", "0", "604", "\"-\"", "\"Mozilla/5.0", "(Linux;", "Android", "5.1.1;",
"AFTM", "Build/LVY48F;", "wv)", "AppleWebKit/537.36", "(KHTML,", "like",
"Gecko)", "Version/4.0", "Chrome/55.0.2883.91", "Mobile", "Safari/537.36\"",
"49343", "\"-\"", ""],
["1523756653", "0", "81.132.50.208", "2227424", "152.195.141.240", "80",
"TCP_HIT/206", "262442", "GET",
"http://example.com/04C0BF/v2/sources/content-owners/cinedigm-itub/398629/v201711170053-2061k.mp4+4582327.ts",
"-", "0", "519", "\"-\"", "\"Roku/DVP-8.0", "(068.00E04155A)\"", "49343",
"\"-\"", ""],
...
]}
We want only the cache status, such as "TCP_HIT/200", and the string beginning with http, which contains the video ID that we will later extract from the URL. Let's write a simple test.
defmodule AccessLogAppTest do
  use ExUnit.Case
  doctest AccessLogApp
  ...
  test "filters each row down to the fields matching the given strings" do
    data = [
      ["a1", "TCP_HIT/200", "http://example.com/ABCD/a/b/c/123456/somefile.mp4.ts"],
      ["a2", "TCP_HIT/206", "http://example.com/ABCD/e/f/789012/someotherfile.mp4.ts"]
    ]
    result = AccessLogApp.CLI.filter_data(data, ["TCP_HIT", "http"])
    # Each row becomes a keyword list keyed by the downcased search string.
    assert result == [
      [tcp_hit: "TCP_HIT/200", http: "http://example.com/ABCD/a/b/c/123456/somefile.mp4.ts"],
      [tcp_hit: "TCP_HIT/206", http: "http://example.com/ABCD/e/f/789012/someotherfile.mp4.ts"]
    ]
  end
end
Writing the test first forces us to decide on the data structure we expect back.
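defmodule AccessLogApp.CLI do
  ...
  # Builds one keyword list per row. Each search string becomes a key
  # (downcased and converted to an atom) whose value is the first field
  # in the row that contains the string, or nil if no field matches.
  def filter_data(list, strings) do
    Enum.map(list, fn item ->
      Enum.map(strings, fn string ->
        {
          String.to_atom(String.downcase(string)),
          Enum.at(Enum.filter(item, &String.contains?(&1, string)), 0)
        }
      end)
    end)
  end
  ...
end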
In our CLI.ex file we create a function that takes the list and the strings to match. The outer Enum.map takes each row in the list and runs an anonymous function on it. The nested Enum.map loops through the given strings and, for each one, runs Enum.filter on the row to keep only the fields that contain the string; fields that do not contain it are discarded. Enum.at then takes the first match (or nil if nothing matched) and pairs it with a key built by downcasing the string and converting it to an atom. The result is a list of keyword lists containing only the data specified by our matching strings. With the strings "http" and "TCP", the result for our log data looks like this:
[
[
http: "http://example.com/04C0BF/v2/sources/content-owners/sgl-entertainment/275211/v0401185814-1389k.mp4+740005.ts",
tcp: "TCP_HIT/200"
],
[
http: "http://example.com/04C0BF/v2/sources/content-owners/sgl-entertainment/326260/v20169101326-1256x544-3063k.mp4+3713710.ts",
tcp: "TCP_HIT/200"
],
[
http: "http://example.com/04C0BF/v2/sources/content-owners/cinedigm-itub/398629/v201711170053-2061k.mp4+4582327.ts",
tcp: "TCP_HIT/200"
],
[
http: "http://example.com/04C0BF/v2/sources/content-owners/cinedigm-itub/398629/v201711170053-2061k.mp4+4582327.ts",
tcp: "TCP_HIT/206"
],
[
http: "http://example.com/04C0BF/v2/sources/content-owners/artcast/186001/v0205061236-3219k.mp4+1980019.ts",
...
],
[...],
...
]
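To make the behaviour on a single row easier to see, here is a small sketch that can be pasted into iex. It reuses the same Enum.filter and Enum.at expression as filter_data/2. The first row comes from the test above; the second row and the first_match helper are hypothetical and exist only to illustrate what happens when a row has no field containing the search string.

```elixir
# Row taken from the test earlier in the post.
hit_row = ["a1", "TCP_HIT/200", "http://example.com/ABCD/a/b/c/123456/somefile.mp4.ts"]

# Hypothetical row (not from the real log) with a cache miss.
miss_row = ["a3", "TCP_MISS/200", "http://example.com/ABCD/x/y/345678/anotherfile.mp4.ts"]

# Hypothetical helper: the same expression filter_data/2 evaluates
# for each search string.
first_match = fn row, string ->
  Enum.at(Enum.filter(row, &String.contains?(&1, string)), 0)
end

first_match.(hit_row, "TCP_HIT")   # "TCP_HIT/200"
first_match.(miss_row, "TCP_HIT")  # nil, because no field contains "TCP_HIT"
first_match.(miss_row, "TCP")      # "TCP_MISS/200"
```

Filtering on "TCP" rather than "TCP_HIT" keeps both hits and misses, which is what we need if we later want to count misses as well.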
In this post we showed a way to get only the items we want from a list. We gave a function a list of rows and a list of strings to filter with, and it returned the key-value pairs we wanted for each row. The results are collected into a list of keyword lists. But what if we wanted to return a list of maps instead? Check out the Elixir - List vs Maps blog post to see how to add to a list, map or tuple. If you like this post, please share and subscribe!
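As a rough idea of what the map-based alternative could look like, here is a minimal sketch. The FilterSketch module and the filter_data_as_maps/2 function are hypothetical names, not part of the series' code. It also swaps the Enum.filter plus Enum.at pair for Enum.find, which returns the first matching element or nil.

```elixir
defmodule FilterSketch do
  # Hypothetical variant of filter_data/2 that returns one map per row
  # instead of a keyword list.
  def filter_data_as_maps(list, strings) do
    Enum.map(list, fn item ->
      # Map.new/2 builds a map from the {key, value} tuples returned here.
      Map.new(strings, fn string ->
        {
          String.to_atom(String.downcase(string)),
          Enum.find(item, &String.contains?(&1, string))
        }
      end)
    end)
  end
end

# Row taken from the test earlier in the post.
data = [
  ["a1", "TCP_HIT/200", "http://example.com/ABCD/a/b/c/123456/somefile.mp4.ts"]
]

[row] = FilterSketch.filter_data_as_maps(data, ["http", "TCP"])
row.http  # "http://example.com/ABCD/a/b/c/123456/somefile.mp4.ts"
row.tcp   # "TCP_HIT/200"
```

With atom keys, a map allows the row.http access syntax, while a keyword list is read with row[:http]. Which one fits better depends on how the later steps use the data.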