[HSP] Generating a 2017 F1 Ranking Page [Part 4]

This time I'll write the HSP code that downloads the 2016 race result (Australian GP) I examined last time and extracts the data we need from it.

https://tks-kan.com/2017/02/17/987/

[HSP] Generating a 2017 F1 Ranking Page [Part 2]

https://tks-kan.com/2017/02/13/964/

The goal this time is:

Reuse the HSP program I wrote earlier to extract the data

So, let's get started!

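Reusing the earlier HSP program

For downloading, extracting the data, and handling character codes (line-ending conversion), I'll reuse the program I wrote before. The details are explained in this article:

Generating an F1 Points Ranking Page with HSP, Part 1 [HSP Tips]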

Please have a look there!

Changing download_url

Set download_url to the grand prix result URL I looked up earlier (here, the result of the 2016 Australian Grand Prix).

https://www.formula1.com/en/results.html/2016/races/938/australia/race-result.html
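As a rough illustration of what the program later does with this URL, the sketch below runs the URL-splitting step from the full listing in isolation. It assumes, as the listing itself does, that split leaves the number of pieces in stat.

```hsp
// Sketch: split the result URL into a base address and a page name
download_url = "https://www.formula1.com/en/results.html/2016/races/938/australia/race-result.html"

split download_url, "/", result      // stat receives the number of pieces
url_pagename = result(stat - 1)      // last piece: the file name to request
url_address  = download_url
strrep url_address, url_pagename, "" // what remains is the base address

mes url_address    // https://www.formula1.com/en/results.html/2016/races/938/australia/
mes url_pagename   // race-result.html
```

The page name doubles as the local file name that the extraction step later loads with noteload.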

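Adding containers to store the data

We need new containers for these four items:

Finishing position
Car number
Laps completed
Time (time gap, lap gap, or retirement)

I set each of them up as follows.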

sdim position,   40, 40
sdim carnumber,  40, 40
sdim laps,       40, 40
sdim times,      40, 40
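For reference, here is a minimal sketch of what such a declaration allocates. In sdim, the second parameter is the default string length and the third is the number of elements, so each array above has 40 elements with indexes 0 to 39. The name demo_array is hypothetical and used only for this illustration.

```hsp
// Sketch: what "sdim name, 40, 40" gives you
sdim demo_array, 40, 40        // 40 string elements, 40 characters each by default

demo_array(0)  = "first"       // lowest valid index
demo_array(39) = "last"        // highest index allocated by sdim

mes length(demo_array)         // 40
mes demo_array(0) + " / " + demo_array(39)   // first / last
```

Forty elements leave plenty of room for a single race's classification.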

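Adding to and changing the data extraction part

This is the key part this time: adding to and changing the data extraction code. As I found last time, each result row looks like this: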

<tr>

    <td class="limiter"></td>
    <td class="dark">1</td>
    <td class="dark hide-for-mobile">6</td>
    <td class="dark bold">
        <span class="hide-for-tablet">Nico</span>
        <span class="hide-for-mobile">Rosberg</span>
        <span class="uppercase hide-for-desktop">ROS</span>
    </td>
    <td class="semi-bold uppercase hide-for-tablet">Mercedes</td>
    <td class="bold hide-for-mobile">57</td>
    <td class="dark bold">1:48:15.565</td>
    <td class="bold">25</td>
    <td class="limiter"></td>
</tr>

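The data we need sits inside the tr tags. As for the conditions, I had been thinking along these lines:

The line is within a tr block
Finishing position: "dark"
Car number: "dark hide-for-mobile"
Driver name: "hide-for-tablet", "hide-for-mobile", "uppercase hide-for-desktop"
Laps: "bold hide-for-mobile"
Time: "dark bold", plus whether any text follows it on the same line
Points: "bold"

Rewritten as conditions that match more exactly:

The line is within a tr block
Finishing position: the line contains <td class="dark">
Car number: the line contains <td class="dark hide-for-mobile">
Driver name: the first name has <span class="hide-for-tablet">, the family name has <span class="hide-for-mobile">
Laps: the line contains <td class="bold hide-for-mobile">
Time (time gap, lap gap, or retirement): the line contains <td class="dark bold"> and also </td>. The second check is needed because the driver-name cell opens with the same <td class="dark bold"> but does not close on that line.
Points: the line contains <td class="bold">

In practice, the check for "within a tr block" looks like it can be omitted. I'll extract the data with these conditions. In HSP, the check is done with instr, which returns -1 when the search string is not found (see the instr reference for details).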
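To make the matching rule concrete, here is a small sketch that tests one search string against two lines from the HTML excerpt above. The variable names line_a, line_b, and search_key are hypothetical.

```hsp
// Sketch: why including the closing quote and ">" in the search string matters
line_a = "    <td class=\"dark\">1</td>"
line_b = "    <td class=\"dark bold\">1:48:15.565</td>"
search_key = "<td class=\"dark\">"

// instr returns the index of the match, counted from the start index (0 here),
// or -1 when the string is not found
mes instr(line_a, 0, search_key)   // 4  (after the four leading spaces)
mes instr(line_b, 0, search_key)   // -1 (class is "dark bold", so no match)
```

Because the full tag is part of the search string, the position cell and the time cell are not confused even though both classes begin with "dark".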

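Writing the actual code

Oh, right. One more thing we need:

Counters that track how many values have been stored for each kind of data

With the conditions above in mind, let's write the code.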

lfcc url_pagename
notesel htmlfile
noteload url_pagename
first_cnt = 0 : family_cnt = 0 : country_cnt = 0 : teamname_cnt = 0 : point_cnt = 0
// Newly added counters
position_cnt = 0 : carnumber_cnt = 0 : laps_cnt = 0 : times_cnt = 0
repeat notemax
    noteget text_line, cnt
    // Finishing position
    if (instr(text_line, 0, "<td class=\"dark\">") ! -1) {
        strrep text_line, "<td class=\"dark\">", ""
        strrep text_line, "</td>", ""
        strrep text_line, " ", ""
        position(position_cnt) = text_line
        position_cnt++
        continue
    }
    // Car number
    if (instr(text_line, 0, "<td class=\"dark hide-for-mobile\">") ! -1) {
        strrep text_line, "<td class=\"dark hide-for-mobile\">", ""
        strrep text_line, "</td>", ""
        strrep text_line, " ", ""
        carnumber(carnumber_cnt) = text_line
        carnumber_cnt++
        continue
    }
    // First name
    if (instr(text_line, 0, "<span class=\"hide-for-tablet\">") ! -1) {
        strrep text_line, "<span class=\"hide-for-tablet\">", ""
        strrep text_line, "</span>", ""
        strrep text_line, " ", ""
        firstname(first_cnt) = text_line
        first_cnt++
        continue
    }
    // Family name
    if (instr(text_line, 0, "<span class=\"hide-for-mobile\">") ! -1) {
        strrep text_line, "<span class=\"hide-for-mobile\">", ""
        strrep text_line, "</span>", ""
        strrep text_line, " ", ""
        familyname(family_cnt) = text_line
        family_cnt++
        continue
    }
    // Team
    if (instr(text_line, 0, "semi-bold uppercase hide-for-tablet") ! -1) {
        split text_line, ">", buf
        split buf(1), "<", result
        teamname(teamname_cnt) = result(0)
        teamname_cnt++
        continue
    }
    // Laps
    if (instr(text_line, 0, "<td class=\"bold hide-for-mobile\">") ! -1) {
        strrep text_line, "<td class=\"bold hide-for-mobile\">", ""
        strrep text_line, "</td>", ""
        strrep text_line, " ", ""
        laps(laps_cnt) = text_line
        laps_cnt++
        continue
    }
    // Time (only when the cell also closes on this line)
    if (instr(text_line, 0, "<td class=\"dark bold\">") ! -1) and (instr(text_line, 0, "</td>") ! -1) {
        split text_line, ">", buf
        split buf(1), "<", result
        times(times_cnt) = result(0)
        times_cnt++
        continue
    }
    // Points scored
    if (instr(text_line, 0, "<td class=\"bold\">") ! -1) {
        strrep text_line, "<td class=\"bold\">", ""
        strrep text_line, "</td>", ""
        strrep text_line, " ", ""
        getpoint(point_cnt) = text_line
        point_cnt++
        continue
    }

loop
noteunsel

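strrep removes the unneeded parts by replacing them with an empty string, and split cuts a line at the given character; between them, only the data we need is left. The values are stored as follows:

Finishing position → position(index 0 to 39)
Car number → carnumber(index 0 to 39)
Family name → familyname(index 0 to 39)
First name → firstname(index 0 to 39)
Team → teamname(index 0 to 39)
Laps → laps(index 0 to 39)
Time → times(index 0 to 39)
Points scored → getpoint(index 0 to 39)

Next time, I'll use these to build the HTML.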
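The two extraction styles can be sketched on their own, using two lines from the HTML excerpt shown earlier. This is only an illustration with the same variable names as the article's code, not a replacement for the loop.

```hsp
// Sketch: strrep-style and split-style extraction on sample lines
sdim position, 40, 40
sdim teamname, 40, 40
position_cnt = 0 : teamname_cnt = 0

// strrep approach: delete the tags and every space, leaving only the value
text_line = "    <td class=\"dark\">1</td>"
strrep text_line, "<td class=\"dark\">", ""
strrep text_line, "</td>", ""
strrep text_line, " ", ""
position(position_cnt) = text_line
position_cnt++

// split approach: cut at ">" and then at "<", so spaces inside the value survive
text_line = "    <td class=\"semi-bold uppercase hide-for-tablet\">Mercedes</td>"
split text_line, ">", buf            // buf(1) is "Mercedes</td"
split buf(1), "<", result            // result(0) is "Mercedes"
teamname(teamname_cnt) = result(0)
teamname_cnt++

mes position(0) + " " + teamname(0)  // 1 Mercedes
```

The split approach is presumably used for the team and time cells because their values can contain spaces, which the strrep approach would strip out.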

Summary

That's it for this time. Here is the code so far. Next time I'll try HTML and CSV output.

// HSP module: this uses the one written by SAKMIS
// Command: lfcc filename
// Load → convert line endings → save
    #module
    #deffunc lfcc str filename
;   mref filename,32
;   mref status,64
        exist filename
        size=strsize
        if size=-1 : status=-1 : return
        sdim ss,size+1,1
        bload filename,ss,size

        ii=0
        code=0
        sdim data,size<<1,1

    repeat size
        tt = peek (ss,cnt)
        if tt=10 : code=10 : break
        if tt=13 {
        code=13
        tt = peek (ss,cnt+1)
        if tt=10 : code=0
        break
        }
    loop

        if code=0 : status=-1 : return

    repeat size
        tt = peek (ss,cnt)
        if tt=code : wpoke data,ii,2573 : ii+2 : continue
        poke data,ii,tt : ii++
    loop

        bsave filename,data,ii
        status=ii
    return
#global

    #include "hspinet.as"

    // Check the network connection
    netinit
    if stat : dialog "Cannot connect to the network" : end

    // Initial settings
    download_url = "https://www.formula1.com/en/results.html/2016/races/938/australia/race-result.html"
    sdim firstname,  40, 40
    sdim familyname, 40, 40
    sdim country,    40, 40
    sdim teamname,   40, 40
    sdim getpoint,   40, 40
    // Newly added
    sdim position,   40, 40
    sdim carnumber,  40, 40
    sdim laps,       40, 40
    sdim times,      40, 40


    /* Download part created in Part 1 */

    // Split the URL
    if (instr(download_url, 0, ".html") ! -1) or (instr(download_url, 0, ".php") ! -1) { // if the URL contains .html or .php
        split download_url, "/", result
        url_pagename = result(stat-1)
        url_address  = download_url
        strrep url_address, url_pagename, ""
    } else { // otherwise, use index.html
        url_address  = download_url
        url_pagename = "index.html"
    }

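    // Branch for testing: skip the download and jump straight to the extraction part
    // (comment this line out when the page has not been downloaded yet)
    goto *skippoint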


    neturl url_address
    netrequest url_pagename

    *main
    // Wait for the download to finish
    netexec res
    if res > 0 : goto *comp
    if res < 0 : goto *bad
    await 50
    goto *main

    *bad
    // Error
    neterror estr
    mes "ERROR "+estr
    stop

    *comp
    mes "DOWNLOAD complete"
    stop



    /* Data extraction part */
    *skippoint
    lfcc url_pagename
    notesel htmlfile
    noteload url_pagename
    first_cnt = 0 : family_cnt = 0 : country_cnt = 0 : teamname_cnt = 0 : point_cnt = 0
    position_cnt = 0 : carnumber_cnt = 0 : laps_cnt = 0 : times_cnt = 0
    repeat notemax
        noteget text_line, cnt
        // Finishing position
        if (instr(text_line, 0, "<td class=\"dark\">") ! -1) {
            strrep text_line, "<td class=\"dark\">", ""
            strrep text_line, "</td>", ""
            strrep text_line, " ", ""
            position(position_cnt) = text_line
            position_cnt++
            continue
        }
        // Car number
        if (instr(text_line, 0, "<td class=\"dark hide-for-mobile\">") ! -1) {
            strrep text_line, "<td class=\"dark hide-for-mobile\">", ""
            strrep text_line, "</td>", ""
            strrep text_line, " ", ""
            carnumber(carnumber_cnt) = text_line
            carnumber_cnt++
            continue
        }
        // First name
        if (instr(text_line, 0, "<span class=\"hide-for-tablet\">") ! -1) {
            strrep text_line, "<span class=\"hide-for-tablet\">", ""
            strrep text_line, "</span>", ""
            strrep text_line, " ", ""
            firstname(first_cnt) = text_line
            first_cnt++
            continue
        }
        // Family name
        if (instr(text_line, 0, "<span class=\"hide-for-mobile\">") ! -1) {
            strrep text_line, "<span class=\"hide-for-mobile\">", ""
            strrep text_line, "</span>", ""
            strrep text_line, " ", ""
            familyname(family_cnt) = text_line
            family_cnt++
            continue
        }
        // Team
        if (instr(text_line, 0, "semi-bold uppercase hide-for-tablet") ! -1) {
            split text_line, ">", buf
            split buf(1), "<", result
            teamname(teamname_cnt) = result(0)
            teamname_cnt++
            continue
        }
        // Laps
        if (instr(text_line, 0, "<td class=\"bold hide-for-mobile\">") ! -1) {
            strrep text_line, "<td class=\"bold hide-for-mobile\">", ""
            strrep text_line, "</td>", ""
            strrep text_line, " ", ""
            laps(laps_cnt) = text_line
            laps_cnt++
            continue
        }
        // Time (only when the cell also closes on this line)
        if (instr(text_line, 0, "<td class=\"dark bold\">") ! -1) and (instr(text_line, 0, "</td>") ! -1) {
            split text_line, ">", buf
            split buf(1), "<", result
            times(times_cnt) = result(0)
            times_cnt++
            continue
        }
        // Points scored
        if (instr(text_line, 0, "<td class=\"bold\">") ! -1) {
            strrep text_line, "<td class=\"bold\">", ""
            strrep text_line, "</td>", ""
            strrep text_line, " ", ""
            getpoint(point_cnt) = text_line
            point_cnt++
            continue
        }

    loop
    noteunsel
    stop