WordPress Customization Consulting by a Working Programmer

Whether you are stuck on a WordPress problem, need a bug investigated, or want a new site built, a web application engineer is here to support you.

Hiragana and Katakana PHP Arrays (for Random String Generation)


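I needed to build random hiragana strings to generate dummy data. I searched around but found very few useful hits, so I decided to write it myself.

For that I wanted a copy-and-paste list of hiragana (and katakana) written as a PHP array, with each character wrapped in double quotes, but I couldn't find a site offering that either.
So here are the hiragana and katakana lists, ready to copy and paste.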

Hiragana/katakana lists

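This was the first thing that was hard to find... I was resigned to typing あ, い, う, ... by hand, but then I found a site that lists them all!

Full-width hiragana / katakana / alphanumerics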
http://tagnoheya.com/charlist/charlist2.html
 

The text box on that page contains the whole range 「あ~ん」, so you can copy and paste it as is. That gave me the strings below.

  • Hiragana
$str_hiragana = "ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちぢっつづてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろゎわゐゑをん";
  • Katakana
$str_katakana = "ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲンヴヵヶ";

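If you don't need the unusual kana ゐ, ゑ, ヰ, and ヱ, remove them at this point. They may cause character-encoding trouble later when you display, save, or export the data (meaning: they are a hassle to deal with).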
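If you would rather strip those characters in code than by hand, something like the following should do. This is only a minimal sketch: `removeArchaicKana` is a name I made up for illustration, and it assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original article): drop the archaic
// kana ゐ, ゑ, ヰ, ヱ from a kana string. Assumes UTF-8 source and input.
function removeArchaicKana(string $kana): string
{
    // str_replace() works on bytes, but UTF-8 is self-synchronizing, so
    // replacing whole characters cannot damage the neighbouring ones.
    return str_replace(['ゐ', 'ゑ', 'ヰ', 'ヱ'], '', $kana);
}

$trimmed = removeArchaicKana('ゎわゐゑをん');
echo $trimmed, "\n"; // ゎわをん
```

Run it over `$str_hiragana` and `$str_katakana` before splitting them and the rest of the steps stay the same.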

By the way, I wonder why the small 「ゎ」 exists at all.
I have only ever seen it used in gyaru-moji (lol).

Converting the string to an array with str_split

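Next, split the hiragana (or katakana) string into an array of single characters. I expected str_split to do it in one shot, but it does not handle Japanese (multibyte) text: it splits by bytes, not by characters. I was stuck on this for quite a while...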
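To make the problem concrete, here is a small sketch of what goes wrong, assuming the script is saved as UTF-8. It also shows one workaround I am adding for comparison, `preg_split` with the `u` modifier, which is not the approach this article uses. Note too that PHP 7.4 and later ship a built-in `mb_str_split()`.

```php
<?php
$text = 'あいう';

// str_split() cuts the string into single bytes. Each of these kana is
// 3 bytes in UTF-8, so we get 9 broken pieces instead of 3 characters.
$bytes = str_split($text);

// The "u" modifier makes PCRE treat the string as UTF-8, so the empty
// pattern matches between code points rather than between bytes.
$chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

echo count($bytes), "\n"; // 9
echo count($chars), "\n"; // 3
```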
 

So I based my code on the article 『マルチバイト文字列を指定文字数毎に分解』 (splitting a multibyte string every N characters) from トリッキーコードネット.

tricky-code.net

  • Convert the hiragana string into an array of single characters
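<?php
// PHP 7.4+ has a built-in mb_str_split(), and redeclaring it is a fatal error,
// so define this fallback only when it is missing (and before it is called).
if (!function_exists('mb_str_split')) {
    function mb_str_split($str, $split_len = 1) {
        if ($split_len <= 0) $split_len = 1;

        // Pass the encoding explicitly instead of changing the global setting.
        $strlen = mb_strlen($str, 'UTF-8');
        $ret    = array();
        for ($i = 0; $i < $strlen; $i += $split_len) {
            $ret[] = mb_substr($str, $i, $split_len, 'UTF-8');
        }
        return $ret;
    }
}

$array_hiragana = mb_str_split($str_hiragana);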

$array_hiragana now holds an array with one hiragana character per element.

Wrapping each character in double quotes for the array-definition output

<?php
$result_hiragana = implode('","', $array_hiragana);

Wrapping array data in double quotes with PHP implode – HARD DAY'S NIGHT blog

At this point the output is still incomplete, so add a double quote at the beginning and at the end (the next step).
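As a small illustration of that last step, the sketch below adds the outer quotes and brackets by hand. For comparison it also uses `json_encode` with `JSON_UNESCAPED_UNICODE`, which is my own suggestion rather than part of the original approach; for a plain list of kana it happens to produce the same text.

```php
<?php
$chars = ['ぁ', 'あ', 'ぃ'];

// Manual approach: join with "," and then add the outer quotes and brackets.
$manual = '["' . implode('","', $chars) . '"]';

// json_encode() yields the same text for a plain list of kana;
// JSON_UNESCAPED_UNICODE keeps the characters readable instead of \uXXXX.
$encoded = json_encode($chars, JSON_UNESCAPED_UNICODE);

echo $manual, "\n";             // ["ぁ","あ","ぃ"]
var_dump($manual === $encoded); // bool(true)
```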

Program that prints the hiragana/katakana array definitions

Putting it all together looks like this.

<?php
// PHP 7.4+ has a built-in mb_str_split(), and redeclaring it is a fatal error,
// so define this fallback only when it is missing (and before it is called).
if (!function_exists('mb_str_split')) {
    function mb_str_split($str, $split_len = 1) {
        if ($split_len <= 0) $split_len = 1;

        // Pass the encoding explicitly instead of changing the global setting.
        $strlen = mb_strlen($str, 'UTF-8');
        $ret    = array();
        for ($i = 0; $i < $strlen; $i += $split_len) {
            $ret[] = mb_substr($str, $i, $split_len, 'UTF-8');
        }
        return $ret;
    }
}

$str_hiragana = "ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちぢっつづてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろゎわゐゑをん";
$array_hiragana = mb_str_split($str_hiragana);
$result_hiragana = implode('","', $array_hiragana);

// Add the opening [" and closing "] around the joined characters.
echo "\$hiragana = [\"" . $result_hiragana . "\"];\n";
echo "\n";

$str_katakana = "ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲンヴヵヶ";
$array_katakana = mb_str_split($str_katakana);
$result_katakana = implode('","', $array_katakana);

echo "\$katakana = [\"" . $result_katakana . "\"];\n";

Result

The definitions are written with PHP in mind, but with minor changes they should also work in Java, Python, Swift, and so on.

Note: the brackets in 「["ぁ","あ","ぃ"...]」 mean the same as 「array("ぁ","あ","ぃ"...)」.
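If you want to convince yourself of that, a quick check like this should be enough. The short bracket syntax is available from PHP 5.4 onward.

```php
<?php
// Short array syntax (PHP 5.4+) and the classic array() construct
// build exactly the same array.
$short = ["ぁ", "あ", "ぃ"];
$long  = array("ぁ", "あ", "ぃ");

var_dump($short === $long); // bool(true)
```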

[Output] PHP array definition of the hiragana list

$hiragana = ["ぁ","あ","ぃ","い","ぅ","う","ぇ","え","ぉ","お",
            "か","が","き","ぎ","く","ぐ","け","げ","こ","ご",
            "さ","ざ","し","じ","す","ず","せ","ぜ","そ","ぞ",
            "た","だ","ち","ぢ","っ","つ","づ","て","で","と","ど",
            "な","に","ぬ","ね","の","は","ば","ぱ",
            "ひ","び","ぴ","ふ","ぶ","ぷ","へ","べ","ぺ","ほ","ぼ","ぽ",
            "ま","み","む","め","も","ゃ","や","ゅ","ゆ","ょ","よ",
            "ら","り","る","れ","ろ","ゎ","わ","ゐ","ゑ","を","ん"];

[Output] PHP array definition of the katakana list

$katakana = ["ァ","ア","ィ","イ","ゥ","ウ","ェ","エ","ォ","オ",
            "カ","ガ","キ","ギ","ク","グ","ケ","ゲ","コ","ゴ",
            "サ","ザ","シ","ジ","ス","ズ","セ","ゼ","ソ","ゾ",
            "タ","ダ","チ","ヂ","ッ","ツ","ヅ","テ","デ","ト","ド",
            "ナ","ニ","ヌ","ネ","ノ","ハ","バ","パ",
            "ヒ","ビ","ピ","フ","ブ","プ","ヘ","ベ","ペ","ホ","ボ","ポ",
            "マ","ミ","ム","メ","モ","ャ","ヤ","ュ","ユ","ョ","ヨ",
            "ラ","リ","ル","レ","ロ","ヮ","ワ","ヰ","ヱ","ヲ","ン","ヴ","ヵ","ヶ"];

Generating random hiragana/katakana strings

Now that the preparation is finally done, the random hiragana generator below is based on this article.

qiita.com

<?php
// Hiragana - random string
function getRandomHiragana($length = 5) {
    $hiragana = ["ぁ","あ","ぃ","い","ぅ","う","ぇ","え","ぉ","お",
        "か","が","き","ぎ","く","ぐ","け","げ","こ","ご",
        "さ","ざ","し","じ","す","ず","せ","ぜ","そ","ぞ",
        "た","だ","ち","ぢ","っ","つ","づ","て","で","と","ど",
        "な","に","ぬ","ね","の","は","ば","ぱ",
        "ひ","び","ぴ","ふ","ぶ","ぷ","へ","べ","ぺ","ほ","ぼ","ぽ",
        "ま","み","む","め","も","ゃ","や","ゅ","ゆ","ょ","よ",
        "ら","り","る","れ","ろ","ゎ","わ","ゐ","ゑ","を","ん"];
    // Start from an empty string so that a length of 0 returns '' rather than null.
    $r_str = '';
    for ($i = 0; $i < $length; $i++) {
        $r_str .= $hiragana[rand(0, count($hiragana) - 1)];
    }
    return $r_str;
}
// Katakana - random string
function getRandomKatakana($length = 5) {
    $katakana = ["ァ","ア","ィ","イ","ゥ","ウ","ェ","エ","ォ","オ",
        "カ","ガ","キ","ギ","ク","グ","ケ","ゲ","コ","ゴ",
        "サ","ザ","シ","ジ","ス","ズ","セ","ゼ","ソ","ゾ",
        "タ","ダ","チ","ヂ","ッ","ツ","ヅ","テ","デ","ト","ド",
        "ナ","ニ","ヌ","ネ","ノ","ハ","バ","パ",
        "ヒ","ビ","ピ","フ","ブ","プ","ヘ","ベ","ペ","ホ","ボ","ポ",
        "マ","ミ","ム","メ","モ","ャ","ヤ","ュ","ユ","ョ","ヨ",
        "ラ","リ","ル","レ","ロ","ヮ","ワ","ヰ","ヱ","ヲ","ン","ヴ","ヵ","ヶ"];
    // Start from an empty string so that a length of 0 returns '' rather than null.
    $r_str = '';
    for ($i = 0; $i < $length; $i++) {
        $r_str .= $katakana[rand(0, count($katakana) - 1)];
    }
    return $r_str;
}
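If you would rather not paste the whole array into each function, a variation along these lines might work. It is only a sketch under my own assumptions: `randomKanaString` is a hypothetical name, the character set is passed in as a plain UTF-8 string, and `random_int` requires PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the referenced article): build a random
// string of $length characters drawn from the characters in $source.
function randomKanaString(string $source, int $length = 5): string
{
    // Split the UTF-8 string into single characters.
    $chars = preg_split('//u', $source, -1, PREG_SPLIT_NO_EMPTY);
    $max   = count($chars) - 1;

    $result = '';
    for ($i = 0; $i < $length; $i++) {
        $result .= $chars[random_int(0, $max)];
    }
    return $result;
}

// Prints 8 random characters taken from あいうえお.
$sample = randomKanaString('あいうえお', 8);
echo $sample, "\n";
```

Passing `$str_hiragana` or `$str_katakana` from earlier in the article as `$source` gives the same kind of result as the two functions above.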

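You will probably need these functions only once in a while, so please bookmark this page.

Incidentally, I wrote this program (and spent a good deal of time on it) to generate dummy data: I planned to use it for name readings when creating dummy data with a Laravel factory.
But it turned out that setting the Faker locale to Japanese lets you generate dummy names in either hiragana or katakana, so this code never saw the light of day...

I hope to write about the Japanese Faker settings for Laravel factories someday...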