From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id B766E80175 for ; Thu, 15 Jun 2017 12:38:23 +0200 (CEST) Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=ivg@ieee.org; spf=Pass smtp.mailfrom=ivg@ieee.org; spf=None smtp.helo=postmaster@mail-yw0-f170.google.com Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of ivg@ieee.org) identity=pra; client-ip=209.85.161.170; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="ivg@ieee.org"; x-sender="ivg@ieee.org"; x-conformance=sidf_compatible Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of ivg@ieee.org designates 209.85.161.170 as permitted sender) identity=mailfrom; client-ip=209.85.161.170; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="ivg@ieee.org"; x-sender="ivg@ieee.org"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-yw0-f170.google.com) identity=helo; client-ip=209.85.161.170; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="ivg@ieee.org"; x-sender="postmaster@mail-yw0-f170.google.com"; x-conformance=sidf_compatible IronPort-PHdr: =?us-ascii?q?9a23=3AGmEPyhR+73vuTSKa36WuMR7redpsv+yvbD5Q0YIu?= =?us-ascii?q?jvd0So/mwa6yZR2N2/xhgRfzUJnB7Loc0qyN4v+mATRIyK3CmUhKSIZLWR4BhJ?= =?us-ascii?q?detC0bK+nBN3fGKuX3ZTcxBsVIWQwt1Xi6NU9IBJS2PAWK8TW94jEIBxrwKxd+?= =?us-ascii?q?KPjrFY7OlcS30P2594HObwlSijewZbF/IA+qoQnNq8IbnZZsJqEtxxXTv3BGYf?= =?us-ascii?q?5WxWRmJVKSmxbz+MK994N9/ipTpvws6ddOXb31cKokQ7NYCi8mM30u683wqRbD?= =?us-ascii?q?VwqP6WACXWgQjxFFHhLK7BD+Xpf2ryv6qu9w0zSUMMHqUbw5Xymp4rx1QxH0li?= =?us-ascii?q?gIKz858HnWisNuiqJbvAmhrAF7z4LNfY2ZKOZycqbbcNwdWGRBQ91RVzRfDYyg?= =?us-ascii?q?c4sBAe0BPeNCoIn8oVsFsB+yCAaoCe/qzDJDm3340rAg0+k5Ew7G0gwuEdwNvn?= =?us-ascii?q?rJstv6KLwfUeWpwKTS1zjPc+9a1DX75YPVch4hu/aMXbdofMTS10kgDQXFhUiR?= =?us-ascii?q?p4ziIzOV0foNvHSb7+phSeKvkHMspgZwojixycchkYjJiZwLxV/a7yl5x5w1Jd?= =?us-ascii?q?KhRUN9fNWqHpxQtySAOIt3RMMvW2BouCAgyr0Ho5G3ZiYKyI4/yx/fcfOHc4+I?= =?us-ascii?q?4hX5WOmNJjd4gXRoc6+8iRaq6UWs1PHwW82u3FtJridJiMTAu3EQ2xDJ98SKSO?= =?us-ascii?q?dx80G80jiVzQ/T8PtLIUUsmKrbNZEhxrkwm4IWsUvZHy/2nFz6ja+Yd0k44+So?= =?us-ascii?q?5fnrb7f6qpOGOI90jQb+MqsqmsOhG+g3Lg8OX22D9eS90r3s41H5Ta1UgvEqlq?= =?us-ascii?q?TVqpPXKMQBqqKkAgJZz5wv5wu9Aju6yNgYmGMILFNBeBKJlYjpPFTOLej5Dfeh?= =?us-ascii?q?jFShizZryO7YMbL/GJnNKWLDkLj5cbZn90Fc0BYzzcxY559MFr4OOvfzWkvouN?= =?us-ascii?q?zcDx85KBC0zv38CNR904MeQXiADrWYMKPUq1+I5/ggL/OCZI8P637BLK0f4Pjn?= =?us-ascii?q?izcdlBc9bOH9x5wRYXb+GvlmMm2WZHPthpEKFmJc7SQkS+m/qUOLV3Z8YGq1Qa?= =?us-ascii?q?k85y0gQNanE4jrR42gjfqGxijtTc4eXXxPFl3ZSSSgTI6DQfpZLX/LLw=3D=3D?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0CBAAC/YkJZhqqhVdFcHAEBBAEBCgEBG?= =?us-ascii?q?AYMgkWBS4ENB4NvgTaaWoJAgn6CbY8OA1wHJYV4AoJVB0IVAQEBAQEBAQEBAQE?= =?us-ascii?q?SAQEBCAsLCCgvgjMkgkIBAgIBIwQZAQEpAwsBBAsJAgsaHQICAh8BEgEFAQoSB?= =?us-ascii?q?hMICok1TQMIBQgQjFmRGj+LHWuBbDqDCQEBBYQpAwqECAEBAQEBAQEDAQEBAQE?= =?us-ascii?q?BAQEBEAcIEoQbgjWBYIMigliBfisWgmWCYQGHcwyJBotzgRo7hBKCHn2HQIRng?= =?us-ascii?q?gdVhHGKPotRh2YUH4EVDyaBLIEBCEkZBoQ1Kh8lgU8+NgGHE4I+AQEB?= X-IPAS-Result: =?us-ascii?q?A0CBAAC/YkJZhqqhVdFcHAEBBAEBCgEBGAYMgkWBS4ENB4N?= =?us-ascii?q?vgTaaWoJAgn6CbY8OA1wHJYV4AoJVB0IVAQEBAQEBAQEBAQESAQEBCAsLCCgvg?= =?us-ascii?q?jMkgkIBAgIBIwQZAQEpAwsBBAsJAgsaHQICAh8BEgEFAQoSBhMICok1TQMIBQg?= =?us-ascii?q?QjFmRGj+LHWuBbDqDCQEBBYQpAwqECAEBAQEBAQEDAQEBAQEBAQEBEAcIEoQbg?= =?us-ascii?q?jWBYIMigliBfisWgmWCYQGHcwyJBotzgRo7hBKCHn2HQIRnggdVhHGKPotRh2Y?= =?us-ascii?q?UH4EVDyaBLIEBCEkZBoQ1Kh8lgU8+NgGHE4I+AQEB?= X-IronPort-AV: E=Sophos;i="5.39,343,1493676000"; d="ml'?scan'208,217";a="228405315" Received: from mail-yw0-f170.google.com ([209.85.161.170]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 15 Jun 2017 12:38:21 +0200 Received: by mail-yw0-f170.google.com with SMTP id 63so3664902ywr.0 for ; Thu, 15 Jun 2017 03:38:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ieee-org.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=AyLMbOPWdGQKJTSvaVFgdhmhfg62ftc1MyA5ULp10oM=; b=ZrKJqt5pW6KtzeAXl1lHNbYJOYq1JWPJWUAz+vmywZ/1alDbGsB6utMjGm1hih3IKJ DCCbRp90G+XbVkv0x6jY1HU5vuZQoDP9C5I3dcw8YhWvX0CmZZVVSAQatL5zxHfY/C5W yNIaT3OPp50xHLP4ExJyNF3vNcgV3nGSb+5rEk/PMJzWbQuAioq8cGCyniRw+nG/16EN udQ/PLEYphIbmx0nOKVwArNg0d+fsPWDuruHrotuGsLhyT8Ul931u943cVw5fQLKl06g nIM4lfoOoiGIzPekH0PuB+RFGWO0Tj19uSRWAyN4gHlENn+IES1EusGGmg6Q77kLvhnl YpVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=AyLMbOPWdGQKJTSvaVFgdhmhfg62ftc1MyA5ULp10oM=; b=QAVIL1a1+UQWWkCDsxszPwEuaPDVgwM00Pknnqsh3OkKq3w6i1lq7EL6YtEdv7Nw3r B8puj8Wdj/eXAw2I7Fma99cLpUH0EnIvbMAMQ1VwRGEP5Eod8pdmfuiuyxsLfU+BK+z9 vLMJJ515tcj6tkKtgx7+o2IaW0Z67MoCDg4uiQo0JgoXfI1JkUjI7q3X3+sJg41VR9YJ dK302wCM505uAjBCMEMBqtn7dGjKv3uGIp/caa8WK4DOuvrFCRM1JyLHCR99ifOmdHY9 pSpLD1O5DVfZtKI8zJkpVheGB0/fe9PHyEALmnmqWYXo/XoWYhxDkpfCQonFx8xM3ww7 fSHA== X-Gm-Message-State: AKS2vOylcxnLy3sYU44ORoAyBtRBrNLDiOx6XG7bmsoB+KNQQFY0LTFM b3yTcJeHjKoTmkAO+IRJDcSelSVpeWPk X-Received: by 10.13.232.10 with SMTP id r10mr4026740ywe.161.1497523099708; Thu, 15 Jun 2017 03:38:19 -0700 (PDT) MIME-Version: 1.0 Received: by 10.129.26.1 with HTTP; Thu, 15 Jun 2017 03:38:18 -0700 (PDT) In-Reply-To: References: <412f58b7-8a8b-2356-2626-e1bd010be683@bioreg.kyushu-u.ac.jp> From: Ivan Gotovchits Date: Thu, 15 Jun 2017 12:38:18 +0200 Message-ID: To: Ronan Le Hy Cc: Francois BERENGER , OCaml Mailing List Content-Type: multipart/mixed; boundary="94eb2c0886586708250551fd44ef" Subject: Re: [Caml-list] Can this code be accelerated by porting it to SPOC, SAREK or MetaOCaml ? --94eb2c0886586708250551fd44ef Content-Type: multipart/alternative; boundary="94eb2c0886586708210551fd44ed" --94eb2c0886586708210551fd44ed Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Fran=C3=A7ois, Your code is actually bound by memory, not by computation. The problem is that you create lots of data, and all the time is spent in the allocation functions and GC. The actual computation consumes less than 5% of overall program execution, that can be easily seen by profiling (always profile before starting optimization). Thus moving to GPU will make this 5% of code run faster, at the expense of even higher overhead of moving data between CPU and GPU. So, let's see how we can optimize your code. First of all, your code uses too much mutable state, that can have a negative performance impact in OCaml due to write barriers. Let's make it a little bit cleaner and faster: let nonmutab points =3D let rec loop xs acc =3D match xs with | [] -> acc | (x,w) :: xs -> List.fold_left (fun acc (x',w') -> (Vector3.dist x x', w *. w') :: acc) acc xs |> loop xs in loop points [] This code runs a little bit faster (on 10000 points): original: 14084.1 ms nonmutab: 11309.2 ms It computes absolutely the same result, using the same computational core, and allocating the same amount of trash. The only difference is that we are not using a mutable state anymore, and we are rewarded for that. The next step is to consider, do you really need to produce these intermediate structures, if the result of your program is the computed data, then you can just store it in a file. Or, if you need later this data to be reduced to some value, then you shouldn't create an intermediate result, and apply the reduction in place (so called de-foresting). So, let's generalize a little bit the function so that instead of producing a new list, the function just applies a user-provided function to all produced elements, along with an accumulator. let nonalloc f acc points =3D let rec loop xs acc =3D match xs with | [] -> acc | (x,w) :: xs -> List.fold_left (fun acc (x',w') -> f acc (Vector3.dist x x') (w *. w')) acc xs |> loop xs in loop points acc This function will not allocate any unnecessary data (it will though allocate a pair of floats per each call). So we can use this function to implement the `nonmutab` version, by passing a consing operator and an empty list. We can also try to use it to store data. The data storage process depends on how fast we can reformat the data, and how fast is the sink device. If we will output to /dev/null (that is known to be fast), then we can use the Marshal module and get about 300 MB/s throughtput. Not bad, but still about 10 seconds of running time. If, for example, we just need some scalar metrics, then we don't need to pay an overhead of data, and it will be as fast as possible. For example, with the following implementations of the kernel function let flags =3D [Marshal.No_sharing] let print () x1 x2 =3D Marshal.to_channel stdout x1 flags; Marshal.to_channel stdout x2 flags let product total x1 x2 =3D total +. x1 *. x2 We will have the following timings: printall: 9649.54 ms nonalloc: 541.186 ms So, the actual computation took only half a second, the rest is data processing. Please find attached the whole example. You can run it with the following command: ocamlbuild -pkgs vector3,unix points.native -- > /dev/null Regards, Ivan Gotovchits On Thu, Jun 15, 2017 at 9:09 AM, Ronan Le Hy wrote: > Hi Fran=C3=A7ois, > > 2017-06-15 8:28 GMT+02:00 Francois BERENGER jp>: > > I am wondering if some high performance OCaml experts out there > > can know in advance if some code can go faster by executing it > > on a GPU. > > I have some clear bottleneck in my program. > > Here is how the code looks like: > > --- > > let f (points: (Vector3.t * float) list) =3D > > let acc =3D ref [] in > > let ac p1 x (p2, y) =3D > > acc :=3D (Vector3.dist p1 p2, x *. y) :: !acc > > in > > let rec loop =3D function > > | [] -> () > > | (p1, x) :: xs -> > > L.iter (ac p1 x) xs; > > loop xs > > in > > loop points; > > !acc > > As a baseline before attempting anything on the GPU, I'd vectorize > this. Put all your vectors in a big matrix. Put all your numbers in a > vector. Compute the distances and the products all at once using > Lacaml (make sure you use OpenBLAS as a backend). I'd expect it to be > much faster than the above loop already. > > -- > Ronan > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > --94eb2c0886586708210551fd44ed Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Fran=C3=A7ois,<= br>

Your code is actually bound by memory, not by computa= tion. The problem is that you create lots of data, and all the time is spen= t in the allocation functions and GC. The actual computation consumes less = than 5% of overall program execution, that can be easily seen by profiling = (always profile before starting optimization). Thus moving to GPU will make= this 5% of code run faster, at the expense of even higher overhead of movi= ng data between CPU and GPU.=C2=A0

So, let= 9;s see how we can optimize your code. First of all, your code uses too muc= h mutable state, that can have a negative performance impact in OCaml due t= o write barriers. Let's make it a little bit cleaner and faster:=
=C2=A0 =C2=A0
<= blockquote style=3D"margin:0px 0px 0px 40px;border:none;padding:0px">
<= div>le= t nonmutab points =3D=C2=A0
=C2=A0 let rec lo= op xs acc =3D match xs with
=C2=A0 =C2=A0 | [= ] -> acc
=C2=A0 =C2=A0 | (x,w) :: xs ->= =C2=A0
= =C2=A0 =C2=A0 =C2=A0 List.fold_left (fu= n acc (x',w') ->=C2=A0
=C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 (Vector3.dist x x', w *. w') :: acc) acc x= s |>=C2=A0
=C2=A0 =C2=A0 =C2=A0 loop xs in<= /font>
=C2=A0 loop points []

This code runs a little bit faster (on 10000 poi= nts):

original:= 14084.1 ms
nonmutab: 11309.2 ms=

It computes absolutely the same result, = using the same computational core, and allocating the same amount of trash.= The only difference is that we are not using a mutable state anymore, and = we are rewarded for that.

The next step is to consider, do you really n= eed to produce these intermediate structures, if the result of your program= is the computed data, then you can just store it in a file. Or, if you nee= d later this data to be reduced to some value, then you shouldn't creat= e an intermediate result, and apply the reduction in place (so called de-fo= resting). So, let's generalize a little bit the function so that instea= d of producing a new list, the function just applies a user-provided functi= on to all produced elements, along with an accumulator.=C2=A0

let nonalloc f acc points =3D= =C2=A0
= =C2=A0 let rec loop xs acc =3D match xs= with
<= font face=3D"monospace, monospace">=C2=A0 =C2=A0 | [] -> acc
=C2=A0 =C2=A0 | (x,w) :: xs ->=C2=A0
=C2=A0 =C2=A0 =C2=A0 List.fold_left (fun acc (x',w') -&= gt;=C2=A0
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 f= acc (Vector3.dist x x') (w *. w')) acc xs |>=C2=A0
=C2=A0 =C2=A0 =C2=A0 loop xs in
<= div>
=C2=A0 loop points acc


This function will not allocate any unne= cessary=C2=A0data (it will though allocate a pair of floats per each call).= So we can use this function to implement the `nonmutab` version, by passin= g a consing operator and an empty list. We can also try to use it to store = data. The data storage process depends on how fast we can reformat the data= , and how fast is the sink device. If we will output to /dev/null (that is = known to be fast), then we can use the Marshal module and get about 300 MB/= s throughtput.=C2=A0Not bad, but still about 10 seconds of running time. If= , for example, we just need some scalar metrics, then we don't need to = pay an overhead of data, and it will be as fast as possible. For example, w= ith the following implementations of the kernel function

let flags =3D [Marshal.No_sharing]= =C2=A0

let print () x1 x2 =3D
=C2=A0 Marshal.to_channel stdout x1 flags;
=C2=A0 Marshal.to_channel stdout x2 flags
<= /div>

let produc= t total x1 x2 =3D total +. x1 *. x2

We will have the following timings:

printall: 9649.54 ms
nonalloc: 541.186 ms
<= div style=3D"font-size:12.8px">
So, the actual computation took only half a second, the rest is d= ata processing.

Please find attached the whole example. You can run i= t with the following command:

ocamlbuild= -pkgs vector3,unix points.native -- > /dev/null

Regards,
Ivan Gotovchits
=

=
On Thu, Jun 15, 2017 at 9:09 AM, Ronan Le Hy= <ronan.lehy@gmail.com> wrote:
Hi Fran=C3=A7ois,

2017-06-15 8:28 GMT+02:00 Francois BERENGER <berenger@bioreg.kyushu-u.ac.jp>:
> I am wondering if some high performance OCaml experts out there
> can know in advance if some code can go faster by executing it
> on a GPU.
> I have some clear bottleneck in my program.
> Here is how the code looks like:
> ---
> let f (points: (Vector3.t * float) list) =3D
>=C2=A0 =C2=A0let acc =3D ref [] in
>=C2=A0 =C2=A0let ac p1 x (p2, y) =3D
>=C2=A0 =C2=A0 =C2=A0acc :=3D (Vector3.dist p1 p2, x *. y) :: !acc
>=C2=A0 =C2=A0in
>=C2=A0 =C2=A0let rec loop =3D function
>=C2=A0 =C2=A0 =C2=A0| [] -> ()
>=C2=A0 =C2=A0 =C2=A0| (p1, x) :: xs ->
>=C2=A0 =C2=A0 =C2=A0 =C2=A0L.iter (ac p1 x) xs;
>=C2=A0 =C2=A0 =C2=A0 =C2=A0loop xs
>=C2=A0 =C2=A0in
>=C2=A0 =C2=A0loop points;
>=C2=A0 =C2=A0!acc

As a baseline before attempting anything on the GPU, I'd vectori= ze
this. Put all your vectors in a big matrix. Put all your numbers in a
vector. Compute the distances and the products all at once using
Lacaml (make sure you use OpenBLAS as a backend). I'd expect it to be much faster than the above loop already.

--
Ronan

--
Caml-list mailing list.=C2=A0 Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

--94eb2c0886586708210551fd44ed-- --94eb2c0886586708250551fd44ef Content-Type: application/octet-stream; name="points.ml" Content-Disposition: attachment; filename="points.ml" Content-Transfer-Encoding: base64 X-Attachment-Id: f_j3yajbyr0 b3BlbiBQcmludGYKCgpsZXQgb3JpZ2luYWwgKHBvaW50czogKFZlY3RvcjMu dCAqIGZsb2F0KSBsaXN0KSA9CiAgbGV0IGFjYyA9IHJlZiBbXSBpbgogIGxl dCBhYyBwMSB4IChwMiwgeSkgPQogICAgYWNjIDo9IChWZWN0b3IzLmRpc3Qg cDEgcDIsIHggKi4geSkgOjogIWFjYwogIGluCiAgbGV0IHJlYyBsb29wID0g ZnVuY3Rpb24KICAgIHwgW10gLT4gKCkKICAgIHwgKHAxLCB4KSA6OiB4cyAt PgogICAgICBMaXN0Lml0ZXIgKGFjIHAxIHgpIHhzOwogICAgICBsb29wIHhz CiAgaW4KICBsb29wIHBvaW50czsKICAhYWNjCgpsZXQgbm9ubXV0YWIgcG9p bnRzID0gCiAgbGV0IHJlYyBsb29wIHhzIGFjYyA9IG1hdGNoIHhzIHdpdGgK ICAgIHwgW10gLT4gYWNjCiAgICB8ICh4LHcpIDo6IHhzIC0+IAogICAgICBM aXN0LmZvbGRfbGVmdCAoZnVuIGFjYyAoeCcsdycpIC0+IAogICAgICAgICAg KFZlY3RvcjMuZGlzdCB4IHgnLCB3ICouIHcnKSA6OiBhY2MpIGFjYyB4cyB8 PiAKICAgICAgbG9vcCB4cyBpbgogIGxvb3AgcG9pbnRzIFtdCgpsZXQgbm9u YWxsb2MgZiBhY2MgcG9pbnRzID0gCiAgbGV0IHJlYyBsb29wIHhzIGFjYyA9 IG1hdGNoIHhzIHdpdGgKICAgIHwgW10gLT4gYWNjCiAgICB8ICh4LHcpIDo6 IHhzIC0+IAogICAgICBMaXN0LmZvbGRfbGVmdCAoZnVuIGFjYyAoeCcsdycp IC0+IAogICAgICAgICAgZiBhY2MgKFZlY3RvcjMuZGlzdCB4IHgnKSAodyAq LiB3JykpIGFjYyB4cyB8PiAKICAgICAgbG9vcCB4cyBpbgogIGxvb3AgcG9p bnRzIGFjYwoKCmxldCB0aW1laXQgbmFtZSBmIHggPSAKICBHYy5tYWpvciAo KTsKICBsZXQgdDAgPSBVbml4LmdldHRpbWVvZmRheSAoKSBpbgogIGxldCBy ID0gZiB4IGluCiAgbGV0IHQxID0gVW5peC5nZXR0aW1lb2ZkYXkgKCkgaW4K ICBlcHJpbnRmICIlczogJWcgbXNcbiUhIiBuYW1lICgodDEgLS4gdDApICou IDEwMDAuKTsKICByCgpsZXQgcmV2X2luaXQgbiBmID0gCiAgbGV0IHJlYyBs b29wIGkgeHMgPSAKICAgIGlmIGkgPCBuIHRoZW4gbG9vcCAoaSsxKSAoZiBp IDo6IHhzKSBlbHNlIHhzIGluCiAgbG9vcCAwIFtdCgpsZXQgcmFuZG9tX3Bv aW50cyBuID0gcmV2X2luaXQgbiAoZnVuIGkgLT4gCiAgICBWZWN0b3IzLm1h a2UKICAgICAgKFJhbmRvbS5mbG9hdCAxLikKICAgICAgKFJhbmRvbS5mbG9h dCAxLikKICAgICAgKFJhbmRvbS5mbG9hdCAxLiksCiAgICBSYW5kb20uZmxv YXQgMS4pCgpsZXQgZmxhZ3MgPSBbTWFyc2hhbC5Ob19zaGFyaW5nXSAKCmxl dCBwcmludCAoKSB4MSB4MiA9CiAgTWFyc2hhbC50b19jaGFubmVsIHN0ZG91 dCB4MSBmbGFnczsKICBNYXJzaGFsLnRvX2NoYW5uZWwgc3Rkb3V0IHgyIGZs YWdzCgpsZXQgcHJvZHVjdCB0b3RhbCB4MSB4MiA9IHRvdGFsICsuIHgxICou IHgyCgoKCmxldCBtYWluICgpID0gCiAgUmFuZG9tLmluaXQgNDI7CiAgbGV0 IHBvaW50cyA9IHJhbmRvbV9wb2ludHMgMTAwMDAgaW4KICBsZXQgX3IxID0g dGltZWl0ICJvcmlnaW5hbCIgb3JpZ2luYWwgcG9pbnRzIGluCiAgbGV0IF9y MiA9IHRpbWVpdCAibm9ubXV0YWIiIG5vbm11dGFiIHBvaW50cyBpbgogIGxl dCAoKSAgPSB0aW1laXQgInByaW50YWxsIiAobm9uYWxsb2MgcHJpbnQgKCkp IHBvaW50cyBpbgogIGxldCBwcmQgPSB0aW1laXQgIm5vbmFsbG9jIiAobm9u YWxsb2MgcHJvZHVjdCAwLikgcG9pbnRzIGluCiAgZXByaW50ZiAicHJvZHVj dCA9ICVnXG4iIHByZAoKCmxldCAoKSA9IG1haW4gKCkK --94eb2c0886586708250551fd44ef--