[Wireshark] Reassembling and Dissecting Custom Protocol Messages Split Across Packets

In the previous article, I showed how to write a dissector script for a custom protocol when, as often happens with TCP, a single IP packet contains multiple messages.

taekwongineer.hatenablog.jp

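This time I'll show how to dissect the protocol when a single custom protocol message is split across multiple IP packets.

Incidentally, the site below says that if you can't reassemble split packets, you shouldn't write a TCP dissector plugin at all...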

TCP reassembly
You should not write a dissector for TCP payload if you cannot handle reassembly (i.e., don't add your Proto object to the DissectorTable for tcp).
Lua/Dissectors - The Wireshark Wiki

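Contents

１．The custom protocol to dissect
２．Sample code for the dissector plugin
３．Summary

１．The custom protocol to dissect

As in the previous article, I'll build the dissector plugin using a protocol with the message format shown in Figure 1-1 below.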

f:id:taekwongineer:20200419163817p:plain

Figure 1-1 Message format of the protocol to dissect
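
To make the format concrete, here is a minimal plain-Lua sketch that runs outside Wireshark and builds and parses one message. It assumes a 2-byte big-endian MessageSize followed by that many bytes of Message, which is an assumption read off the dissector code in this article (it decodes MessageSize with uint()). The helper names `encode_message` and `decode_message` are made up for illustration and are not part of any Wireshark API.

```lua
-- Hypothetical helpers for illustration; not part of the Wireshark API.

-- Build one SampleProtocol message: 2-byte big-endian size + payload.
local function encode_message(msg)
    local size = #msg
    assert(size <= 0xFFFF, "payload does not fit in a 16-bit size field")
    local hi = math.floor(size / 256)
    local lo = size % 256
    return string.char(hi, lo) .. msg
end

-- Parse one message starting at byte position pos (1-based).
-- Returns the payload and the position of the next message,
-- or nil if the data is incomplete.
local function decode_message(data, pos)
    pos = pos or 1
    if #data - pos + 1 < 2 then return nil end      -- size field incomplete
    local hi, lo = data:byte(pos, pos + 1)
    local size = hi * 256 + lo
    local first = pos + 2
    local last = first + size - 1
    if last > #data then return nil end             -- payload incomplete
    return data:sub(first, last), last + 1
end

-- Two messages back to back, as they could appear in one TCP stream.
local wire = encode_message("Hello") .. encode_message("World!")
local m1, next_pos = decode_message(wire, 1)
local m2 = decode_message(wire, next_pos)
print(m1, m2)
```

Because TCP is a byte stream, `wire` may reach the receiver cut at any position. `decode_message` returning nil corresponds to the situation this article deals with.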

２．Sample code for the dissector plugin

I'll add code that reassembles split messages to the sample code from the previous article.
The comments marked with ★ show the added part.

-- Define the custom protocol
SampleProto = Proto.new("sample","SampleProtocol")

-- Define the protocol fields
f_MsgSize = ProtoField.new("MessageSize","sample.size",ftypes.UINT16)
f_Msg = ProtoField.new("Message","sample.msg",ftypes.STRING)

-- Register the protocol fields with the protocol
SampleProto.fields = {f_MsgSize,f_Msg}

-- Define the dissector function
function SampleProto.dissector(buffer, pinfo, tree)

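    -- Local variables used inside the dissector function
    local MsgNum = 0    -- number of SampleProtocol messages in this packet
    local BufLeft = buffer:len()    -- number of unread bytes remaining in buffer
    local BufPos = 0    -- offset in buffer at which to start reading
    local r_BufRange    -- range of one whole SampleProtocol message
    local r_MsgSize    -- range of MessageSize
    local r_Msg    -- range of Message

    -- Keep dissecting while at least 3 bytes remain
    -- (2-byte MessageSize plus at least 1 byte of Message)
    while BufLeft >= 3 do
        ---- Read data from buffer ----
        -- Read MessageSize
        r_MsgSize = buffer(BufPos, 2)

        -- Update the number of unread bytes in buffer
        BufLeft = BufLeft - 2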
        
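        -- ★★★★★ Added part starts here ★★★★★
        -- Compare MessageSize with the number of bytes remaining in buffer
        if r_MsgSize:uint() > BufLeft then

            -- MessageSize is larger than what remains in buffer, so the
            -- message continues in a later segment.
            -- Set pinfo.desegment_offset to the start of this incomplete
            -- message (so that messages already dissected in this packet
            -- are not handed back again), set pinfo.desegment_len to the
            -- number of missing bytes, and return from the dissector function.
            pinfo.desegment_offset = BufPos
            pinfo.desegment_len = r_MsgSize:uint() - BufLeft
            return buffer:len()
        end
        -- ★★★★★ Added part ends here ★★★★★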

        -- Take the range of the whole SampleProtocol message (MessageSize + Message)
        r_BufRange = buffer(BufPos, 2 + r_MsgSize:uint())

        -- Advance the read offset past MessageSize
        BufPos = BufPos + 2

        -- Read Message
        r_Msg = buffer(BufPos, r_MsgSize:uint())

        ---- Add information to the protocol tree ----
        -- Add "SampleProtocol" to tree as a subtree
        local subtree = tree:add(SampleProto, r_BufRange)

        -- Add the value of each protocol field defined above to subtree
        subtree:add(f_MsgSize, r_MsgSize) -- MessageSize
        subtree:add(f_Msg, r_Msg, r_Msg:string(ENC_UTF_8), "Message:" .. r_Msg:string(ENC_UTF_8)) -- Msg

        -- Append MessageSize and Message to the top line of subtree
        subtree:append_text(",MessageSize:" .. r_MsgSize:uint() .. ",Message:" .. r_Msg:string(ENC_UTF_8))

        ---- Advance the read offset past Message
        BufPos = BufPos + r_MsgSize:uint()

        ---- Update the number of unread bytes in buffer
        BufLeft = BufLeft - r_MsgSize:uint()

        -- Update the SampleProtocol message count
        MsgNum = MsgNum + 1
    end

    ---- Set the protocol information
    -- "Protocol" column
    pinfo.cols.protocol = "Sample"

    -- "Info" column (depends on the value of MsgNum)
    if MsgNum == 1 then
        -- Exactly one message: show the MessageSize and Message values
        pinfo.cols.info = "MessageSize:" .. r_MsgSize:uint() .. ",Message:" .. r_Msg:string(ENC_UTF_8)
    else
        -- Otherwise (typically two or more messages): show the message count
        pinfo.cols.info = "This packet contains " .. MsgNum .. " messages."
    end
end

-- Associate the protocol with a TCP port number
tcp_table = DissectorTable.get("tcp.port")
tcp_table:add(20406, SampleProto)

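Here is an explanation of the added part.

        -- ★★★★★ Added part starts here ★★★★★
        -- Compare MessageSize with the number of bytes remaining in buffer
        if r_MsgSize:uint() > BufLeft then

            -- MessageSize is larger than what remains in buffer, so the
            -- message continues in a later segment.
            -- Set pinfo.desegment_offset to the start of this incomplete
            -- message (so that messages already dissected in this packet
            -- are not handed back again), set pinfo.desegment_len to the
            -- number of missing bytes, and return from the dissector function.
            pinfo.desegment_offset = BufPos
            pinfo.desegment_len = r_MsgSize:uint() - BufLeft
            return buffer:len()
        end
        -- ★★★★★ Added part ends here ★★★★★

The code compares the Message length stated in MessageSize with the number of bytes remaining (not yet read) in buffer. If MessageSize is larger, it concludes that the Message is split across multiple IP packets and asks Wireshark to reassemble it.

What needs to be done is very simple:
①Set pinfo.desegment_len to the number of missing bytes (together with pinfo.desegment_offset, the offset at which the incomplete message starts)
②Return the size of buffer as the return value of the dissector function
That is all.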
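
The comparison described above can be tried outside Wireshark with a plain-Lua sketch. `scan_segment` is a hypothetical function that mimics the dissector loop on a Lua string instead of a Tvb; it is only a model of the logic, not Wireshark code. It reports which messages are complete and, for a cut-off message, the offset where it starts and the number of bytes still missing (the values that would go into desegment_offset and desegment_len).

```lua
-- Hypothetical stand-in for the dissector loop, working on a Lua string.
-- Offsets are 0-based to match Tvb offsets in Wireshark.
local function scan_segment(data)
    local pos = 0                 -- plays the role of BufPos
    local left = #data            -- plays the role of BufLeft
    local messages = {}

    while left >= 3 do
        -- Decode the 2-byte big-endian size field.
        local hi, lo = data:byte(pos + 1, pos + 2)
        local size = hi * 256 + lo
        left = left - 2

        if size > left then
            -- The message continues in a later segment: report where it
            -- starts and how many bytes are still missing.
            return messages, { offset = pos, missing = size - left }
        end

        messages[#messages + 1] = data:sub(pos + 3, pos + 2 + size)
        pos = pos + 2 + size
        left = left - size
    end
    return messages, nil
end

-- Two messages, "Hello" and "World!", with the second one cut after "Wor".
local segment = "\0\5Hello" .. "\0\6Wor"
local msgs, pending = scan_segment(segment)
print(#msgs, msgs[1], pending.offset, pending.missing)
```

In this example the first message is complete, and the second one starts at offset 7 with 3 bytes missing.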

The information behind this addition comes from the following source.

TCP reassembly
You should not write a dissector for TCP payload if you cannot handle reassembly (i.e., don't add your Proto object to the DissectorTable for tcp). Like dissectors written in C, Lua dissectors can use Wireshark's ability to reassemble TCP streams:
You should make sure your dissector can handle the following conditions:
(1) The TCP packet segment might only have the first portion of your message.
(2) The TCP packet segment might contain multiple of your messages.
(3) The TCP packet might be in the middle of your message, because a previous segment was not captured. For example, if the capture started in the middle of a TCP session, then the first TCP segment will be given to your dissector function, but it may well be a second/third/etc. segment of your protocol's whole message, so appear to be malformed. Wireshark will keep trying your dissector for each subsequent segment as well, so that eventually you can find the beginning of a message format you understand.
(4) The TCP packet might be cut-off, because the user set Wireshark to limit the size of the packets being captured.
(5) Any combination of the above.

(... omitted ...)

For case (1), you have to dissect your message enough to figure out what the full length will be - if you can figure that out, then set the Pinfo's desegment_len to how many more bytes than are currently in the Tvb that you need in order to decode the full message;

(... omitted ...)

For the return value of your Proto's dissector() function, you should return one of the following:
If the packet does not belong to your dissector, return 0. You must not set the Pinfo.desegment_len nor the desegment_offset if you return 0.
If you need more bytes, set the Pinfo's desegment_len/desegment_offset as described earlier, and return either nothing, or return the length of the Tvb. Either way is fine.
If you don't need more bytes, return either nothing, or return the length of the Tvb. Either way is fine.

Lua/Dissectors - The Wireshark Wiki
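
One situation the sample code in this article does not handle is a segment that ends before the 2-byte MessageSize is complete, so the full length cannot be determined yet. For that situation Wireshark provides the constant DESEGMENT_ONE_MORE_SEGMENT as a value for desegment_len. The following is only a sketch of how the two cases could be combined: `request_more_data` is a hypothetical helper, and plain Lua tables stand in for Pinfo so that it also runs outside Wireshark.

```lua
-- In Wireshark this global is predefined; the fallback value is used only
-- so that the sketch also runs in a plain Lua interpreter.
local ONE_MORE_SEGMENT = DESEGMENT_ONE_MORE_SEGMENT or 0x0fffffff

-- Hypothetical helper: decide whether more TCP data is needed for the
-- message that starts at buf_pos.
--   buf_left : bytes from buf_pos to the end of the buffer,
--              including the 2-byte size field
--   msg_size : decoded MessageSize, or nil if it could not be read yet
-- Returns true when reassembly was requested.
local function request_more_data(pinfo, buf_pos, buf_left, msg_size)
    local header = 2
    if buf_left < header then
        -- Not even the size field is complete: ask for one more segment.
        pinfo.desegment_offset = buf_pos
        pinfo.desegment_len = ONE_MORE_SEGMENT
        return true
    end
    local missing = header + msg_size - buf_left
    if missing > 0 then
        -- The payload is cut off: ask for exactly the missing bytes.
        pinfo.desegment_offset = buf_pos
        pinfo.desegment_len = missing
        return true
    end
    return false
end

-- Plain tables stand in for Pinfo when run outside Wireshark.
local p1, p2, p3 = {}, {}, {}
print(request_more_data(p1, 7, 1))        -- size field cut in half
print(request_more_data(p2, 7, 5, 6))     -- payload cut off
print(request_more_data(p3, 0, 7, 5))     -- complete message
```

In a real dissector, the function would return right after a call that yields true, either with no value or with the length of the Tvb, as the quoted text describes.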