🇺🇸 To people of Russia

There is a war in Ukraine right now. The forces of the Russian Federation are attacking civilian infrastructure in [Kharkiv][1], [Kyiv][2], [Chernihiv][3], [Sumy][4], [Irpin][5] and dozens of other cities. People are dying – both civilians and military servicemen, including Russian conscripts who were thrown into the fighting. In order to deprive its own people of access to information, the government of the Russian Federation has forbidden calling a war a war, shut down independent media and is passing a number of dictatorial laws. These laws are meant to silence all those who are against war. You can be jailed for multiple years for simply calling for peace. Do not be silent! Silence is a sign that you accept the Russian government's policy. You can choose NOT TO BE SILENT.

  • psalm-immutable
Cloneable, Final, Instantiable
Constants
public voku\helper\ASCII::AMHARIC_LANGUAGE_CODE = 'am'
public voku\helper\ASCII::ARABIC_LANGUAGE_CODE = 'ar'
public voku\helper\ASCII::ARMENIAN_LANGUAGE_CODE = 'hy'
public voku\helper\ASCII::AZERBAIJANI_LANGUAGE_CODE = 'az'
public voku\helper\ASCII::BELARUSIAN_LANGUAGE_CODE = 'be'
public voku\helper\ASCII::BENGALI_LANGUAGE_CODE = 'bn'
public voku\helper\ASCII::BULGARIAN_LANGUAGE_CODE = 'bg'
public voku\helper\ASCII::CHINESE_LANGUAGE_CODE = 'zh'
public voku\helper\ASCII::CROATIAN_LANGUAGE_CODE = 'hr'
public voku\helper\ASCII::CZECH_LANGUAGE_CODE = 'cs'
public voku\helper\ASCII::DANISH_LANGUAGE_CODE = 'da'
public voku\helper\ASCII::DUTCH_LANGUAGE_CODE = 'nl'
public voku\helper\ASCII::ENGLISH_LANGUAGE_CODE = 'en'
public voku\helper\ASCII::ESPERANTO_LANGUAGE_CODE = 'eo'
public voku\helper\ASCII::ESTONIAN_LANGUAGE_CODE = 'et'
public voku\helper\ASCII::EXTRA_LATIN_CHARS_LANGUAGE_CODE = 'latin'
public voku\helper\ASCII::EXTRA_MSWORD_CHARS_LANGUAGE_CODE = 'msword'
public voku\helper\ASCII::EXTRA_WHITESPACE_CHARS_LANGUAGE_CODE = ' '
public voku\helper\ASCII::FINNISH_LANGUAGE_CODE = 'fi'
public voku\helper\ASCII::FRENCH_AUSTRIAN_LANGUAGE_CODE = 'fr_at'
public voku\helper\ASCII::FRENCH_LANGUAGE_CODE = 'fr'
public voku\helper\ASCII::FRENCH_SWITZERLAND_LANGUAGE_CODE = 'fr_ch'
public voku\helper\ASCII::GEORGIAN_LANGUAGE_CODE = 'ka'
public voku\helper\ASCII::GERMAN_AUSTRIAN_LANGUAGE_CODE = 'de_at'
public voku\helper\ASCII::GERMAN_LANGUAGE_CODE = 'de'
public voku\helper\ASCII::GERMAN_SWITZERLAND_LANGUAGE_CODE = 'de_ch'
public voku\helper\ASCII::GREEK_LANGUAGE_CODE = 'el'
public voku\helper\ASCII::GREEKLISH_LANGUAGE_CODE = 'el__greeklish'
public voku\helper\ASCII::HINDI_LANGUAGE_CODE = 'hi'
public voku\helper\ASCII::HUNGARIAN_LANGUAGE_CODE = 'hu'
public voku\helper\ASCII::ITALIAN_LANGUAGE_CODE = 'it'
public voku\helper\ASCII::JAPANESE_LANGUAGE_CODE = 'ja'
public voku\helper\ASCII::KAZAKH_LANGUAGE_CODE = 'kk'
public voku\helper\ASCII::KIRGHIZ_LANGUAGE_CODE = 'ky'
public voku\helper\ASCII::KOREAN_LANGUAGE_CODE = 'ko'
public voku\helper\ASCII::LATVIAN_LANGUAGE_CODE = 'lv'
public voku\helper\ASCII::LITHUANIAN_LANGUAGE_CODE = 'lt'
public voku\helper\ASCII::MACEDONIAN_LANGUAGE_CODE = 'mk'
public voku\helper\ASCII::MONGOLIAN_LANGUAGE_CODE = 'mn'
public voku\helper\ASCII::MYANMAR_LANGUAGE_CODE = 'my'
public voku\helper\ASCII::NORWEGIAN_LANGUAGE_CODE = 'no'
public voku\helper\ASCII::ORIYA_LANGUAGE_CODE = 'or'
public voku\helper\ASCII::PASHTO_LANGUAGE_CODE = 'ps'
public voku\helper\ASCII::PERSIAN_LANGUAGE_CODE = 'fa'
public voku\helper\ASCII::POLISH_LANGUAGE_CODE = 'pl'
public voku\helper\ASCII::PORTUGUESE_LANGUAGE_CODE = 'pt'
public voku\helper\ASCII::ROMANIAN_LANGUAGE_CODE = 'ro'
public voku\helper\ASCII::RUSSIAN_GOST_2000_B_LANGUAGE_CODE = 'ru__gost_2000_b'
public voku\helper\ASCII::RUSSIAN_LANGUAGE_CODE = 'ru'
public voku\helper\ASCII::RUSSIAN_PASSPORT_2013_LANGUAGE_CODE = 'ru__passport_2013'
public voku\helper\ASCII::SERBIAN_CYRILLIC_LANGUAGE_CODE = 'sr__cyr'
public voku\helper\ASCII::SERBIAN_LANGUAGE_CODE = 'sr'
public voku\helper\ASCII::SERBIAN_LATIN_LANGUAGE_CODE = 'sr__lat'
public voku\helper\ASCII::SLOVAK_LANGUAGE_CODE = 'sk'
public voku\helper\ASCII::SWEDISH_LANGUAGE_CODE = 'sv'
public voku\helper\ASCII::THAI_LANGUAGE_CODE = 'th'
public voku\helper\ASCII::TURKISH_LANGUAGE_CODE = 'tr'
public voku\helper\ASCII::TURKMEN_LANGUAGE_CODE = 'tk'
public voku\helper\ASCII::UKRAINIAN_LANGUAGE_CODE = 'uk'
public voku\helper\ASCII::UZBEK_LANGUAGE_CODE = 'uz'
public voku\helper\ASCII::VIETNAMESE_LANGUAGE_CODE = 'vi'
Methods
public static charsArray(bool $replace_extra_symbols = false) : array
 

Returns a replacement array for ASCII methods.

EXAMPLE: $array = ASCII::charsArray(); var_dump($array['ru']['б']); // 'b'

  • psalm-suppress InvalidNullableReturnType - we use the prepare* methods here, so we don't get NULL here
  • param bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".
  • psalm-pure
  • return array
  • phpstan-return array<string, array<string , string>>
public static charsArrayWithMultiLanguageValues(bool $replace_extra_symbols = false) : array
 

Returns a replacement array for ASCII methods with a mix of multiple languages.

EXAMPLE: $array = ASCII::charsArrayWithMultiLanguageValues(); var_dump($array['b']); // ['β', 'б', 'ဗ', 'ბ', 'ب']

  • param bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".
  • psalm-pure
  • return array
  • phpstan-return array<string, array<int, string>>
public static charsArrayWithOneLanguage(string $language = self::ENGLISH_LANGUAGE_CODE, bool $replace_extra_symbols = false, bool $asOrigReplaceArray = true) : array
 

Returns a replacement array for ASCII methods with one language.

For example, German will map 'ä' to 'ae', while other languages will simply return e.g. 'a'.

EXAMPLE: $array = ASCII::charsArrayWithOneLanguage('ru'); $tmpKey = \array_search('yo', $array['replace']); echo $array['orig'][$tmpKey]; // 'ё'

  • psalm-suppress InvalidNullableReturnType - we use the prepare* methods here, so we don't get NULL here
  • param string $language [optional] Language of the source string e.g.: en, de_at, or de-ch. (default is 'en') | ASCII::*_LANGUAGE_CODE
  • param bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".
  • param bool $asOrigReplaceArray [optional] TRUE === return {orig: string[], replace: string[]} array
  • psalm-pure
  • return array
  • phpstan-param ASCII::*_LANGUAGE_CODE $language
  • phpstan-return array{orig: string[], replace: string[]}|array<string, string>
public static charsArrayWithSingleLanguageValues(bool $replace_extra_symbols = false, bool $asOrigReplaceArray = true) : array
 

Returns a replacement array for ASCII methods with multiple languages.

EXAMPLE: $array = ASCII::charsArrayWithSingleLanguageValues(); $tmpKey = \array_search('hnaik', $array['replace']); echo $array['orig'][$tmpKey]; // '၌'

  • param bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".
  • param bool $asOrigReplaceArray [optional] TRUE === return {orig: string[], replace: string[]} array
  • psalm-pure
  • return array
  • phpstan-return array{orig: string[], replace: string[]}|array<string, string>
public static clean(string $str, bool $normalize_whitespace = true, bool $keep_non_breaking_space = false, bool $normalize_msword = true, bool $remove_invisible_characters = true) : string
 

Accepts a string and removes all non-UTF-8 characters from it, plus extras if needed.

  • param string $str
  • param bool $normalize_whitespace [optional] Set to true, if you need to normalize the whitespace.
  • param bool $normalize_msword [optional] Set to true, if you need to normalize MS Word chars e.g.: "…" => "..."
  • param bool $keep_non_breaking_space [optional] Set to true to keep non-breaking spaces, in combination with $normalize_whitespace.
  • param bool $remove_invisible_characters [optional] Set to false if you do not want to remove invisible characters e.g.: "\0"
  • psalm-pure
  • return string
public static getAllLanguages() : array
 

Get all languages from the constants "ASCII::.*LANGUAGE_CODE".

  • return string[]
  • phpstan-return array<string, string>
public static is_ascii(string $str) : bool
 

Checks if a string is 7-bit ASCII.

EXAMPLE: ASCII::is_ascii('白'); // false

  • param string $str
  • psalm-pure
  • return bool
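A minimal sketch of a 7-bit check in plain PHP, for readers who want to see the idea without the library. The function name `is_seven_bit_ascii` is hypothetical, and this is not the library's implementation: it accepts every byte from 0x00 to 0x7F, whereas the class's own private `$REGEX_ASCII` pattern may treat some control characters differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of voku\helper\ASCII):
 * returns true when every byte of $str lies in the range 0x00-0x7F.
 */
function is_seven_bit_ascii(string $str): bool
{
    // Without the /u modifier PCRE works byte by byte, and every byte of a
    // multibyte UTF-8 sequence is >= 0x80, so one match means "not ASCII".
    return preg_match('/[\x80-\xFF]/', $str) === 0;
}

var_dump(is_seven_bit_ascii('abc')); // bool(true)
var_dump(is_seven_bit_ascii('白'));  // bool(false)
```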
public static normalize_msword(string $str) : string
 

Returns a string with smart quotes, ellipsis characters, and dashes from Windows-1252 (commonly used in Word documents) replaced by their ASCII equivalents.

EXAMPLE: ASCII::normalize_msword('„Abcdef…”'); // '"Abcdef..."'

  • param string $str
  • psalm-pure
  • return string
public static normalize_whitespace(string $str, bool $keepNonBreakingSpace = false, bool $keepBidiUnicodeControls = false, bool $normalize_control_characters = false) : string
 

Normalize the whitespace.

EXAMPLE: ASCII::normalize_whitespace("abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC", true); // "abc-\xc2\xa0-öäü- -"

  • param string $str
  • param bool $keepNonBreakingSpace [optional] Set to true, to keep non-breaking-spaces.
  • param bool $keepBidiUnicodeControls [optional] Set to true, to keep non-printable (for the web) bidirectional text chars.
  • param bool $normalize_control_characters [optional] Set to true to replace e.g. LINE SEPARATOR and PARAGRAPH SEPARATOR with "\n" and LINE TABULATION with "\t".
  • psalm-pure
  • return string
public static remove_invisible_characters(string $str, bool $url_encoded = false, string $replacement = '', bool $keep_basic_control_characters = true) : string
 

Remove invisible characters from a string.

e.g.: This prevents sandwiching null characters between ascii characters, like Java\0script.

Copied from https://github.com/bcit-ci/CodeIgniter/blob/develop/system/core/Common.php

  • param string $str
  • param bool $url_encoded
  • param string $replacement
  • param bool $keep_basic_control_characters
  • psalm-pure
  • return string
public static to_ascii(string $str, string $language = self::ENGLISH_LANGUAGE_CODE, bool $remove_unsupported_chars = true, bool $replace_extra_symbols = false, bool $use_transliterate = false, ?bool $replace_single_chars_only = NULL) : string
 

Returns an ASCII version of the string. A set of non-ASCII characters are replaced with their closest ASCII counterparts, and the rest are removed by default. The language or locale of the source string can be supplied for language-specific transliteration in any of the following formats: en, en_GB, or en-GB. For example, passing "de" results in "äöü" mapping to "aeoeue" rather than "aou" as in other languages.

EXAMPLE: ASCII::to_ascii('�Düsseldorf�', 'en'); // Dusseldorf

  • param string $str
  • param string $language [optional] Language of the source string. (default is 'en') | ASCII::*_LANGUAGE_CODE
  • param bool $remove_unsupported_chars [optional] Whether or not to remove the unsupported characters.
  • param bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".
  • param bool $use_transliterate [optional] Use ASCII::to_transliterate() for unknown chars.
  • param bool|null $replace_single_chars_only [optional] Single-char replacement is better for performance, but some languages need to replace more than one char at the same time. | NULL === auto-setting, depending on the language
  • psalm-pure
  • return string
  • phpstan-param ASCII::*_LANGUAGE_CODE $language
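The language-specific behaviour described above can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: the `SKETCH_MAPS` table, the function `sketch_to_ascii`, and the fallback to the `en` map are hypothetical and far smaller than the library's real data files.

```php
<?php
declare(strict_types=1);

// Hypothetical, deliberately tiny replacement tables keyed by language code.
const SKETCH_MAPS = [
    'de' => ['ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue'],
    'en' => ['ä' => 'a', 'ö' => 'o', 'ü' => 'u'],
];

/**
 * Hypothetical sketch: replace known characters using the table for
 * $language, then optionally drop every remaining non-ASCII byte.
 */
function sketch_to_ascii(string $str, string $language = 'en', bool $removeUnsupported = true): string
{
    // Unknown languages fall back to the 'en' table (an assumption of this sketch).
    $map = SKETCH_MAPS[$language] ?? SKETCH_MAPS['en'];
    $str = strtr($str, $map);

    if ($removeUnsupported) {
        // Byte-wise match (no /u): strips whatever the table did not cover.
        $str = preg_replace('/[\x80-\xFF]/', '', $str) ?? '';
    }

    return $str;
}

echo sketch_to_ascii('äöü', 'de'), "\n";         // aeoeue
echo sketch_to_ascii('äöü', 'en'), "\n";         // aou
echo sketch_to_ascii('白Düsseldorf', 'en'), "\n"; // Dusseldorf
```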
public static to_ascii_remap(string $str1, string $str2) : array
 

WARNING: This method will return broken characters and is only for special cases.

Converts two UTF-8 encoded strings to single-byte strings suitable for functions that need the same string length after the conversion.

The function simply uses (and updates) a tailored dynamic encoding (in/out map parameter) where non-ascii characters are remapped to the range [128-255] in order of appearance.

  • param string $str1
  • param string $str2
  • return string[]
  • phpstan-return array{0: string, 1: string}
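The remapping idea can be sketched in plain PHP as follows. This is an illustration under stated assumptions, not the library's code: the function name `remap_to_single_bytes`, the by-reference `$map`, and the exceptions are hypothetical choices. Both strings share one map, so each distinct non-ASCII character is assigned the next free byte in the range 128-255 in order of appearance.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: remap one UTF-8 string to a single-byte string,
 * updating the shared $map as new non-ASCII characters appear.
 *
 * @param array<string, string> $map multibyte character => single byte
 */
function remap_to_single_bytes(string $str, array &$map): string
{
    // Split into UTF-8 characters; preg_split() returns false on invalid UTF-8.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    $out = '';
    foreach ($chars as $char) {
        if (strlen($char) === 1) {
            $out .= $char; // plain ASCII is kept unchanged
            continue;
        }
        if (!isset($map[$char])) {
            if (count($map) >= 128) {
                throw new OverflowException('More than 128 distinct non-ASCII characters.');
            }
            // Next free byte in 128-255, in order of first appearance.
            $map[$char] = chr(128 + count($map));
        }
        $out .= $map[$char];
    }

    return $out;
}

$map = [];
$a = remap_to_single_bytes('Köln', $map);
$b = remap_to_single_bytes('Koln', $map);

// Both results are 4 bytes long, so a byte-oriented function such as
// levenshtein() now counts one edit per character.
echo strlen($a), ' ', strlen($b), ' ', levenshtein($a, $b), "\n"; // 4 4 1
```

The remapped strings are not valid UTF-8 and are only meaningful relative to the shared map, which is why the method carries the warning about broken characters.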
public static to_filename(string $str, bool $use_transliterate = true, string $fallback_char = '-') : string
 

Convert given string to safe filename (and keep string case).

EXAMPLE: ASCII::to_filename('שדגשדג.png', true); // 'shdgshdg.png'

  • param string $str
  • param bool $use_transliterate
  • param string $fallback_char
  • psalm-pure
  • return string
public static to_slugify(string $str, string $separator = '-', string $language = self::ENGLISH_LANGUAGE_CODE, array $replacements = [], bool $replace_extra_symbols = false, bool $use_str_to_lower = true, bool $use_transliterate = false) : string
 

Converts the string into a URL slug. This includes replacing non-ASCII characters with their closest ASCII equivalents, removing remaining non-ASCII and non-alphanumeric characters, and replacing whitespace with $separator. The separator defaults to a single dash, and the string is also converted to lowercase. The language of the source string can also be supplied for language-specific transliteration.

  • param string $str
  • param string $separator [optional] The string used to replace whitespace.
  • param string $language [optional] Language of the source string. (default is 'en') | ASCII::*_LANGUAGE_CODE
  • param array<string,string> $replacements [optional] A map of replaceable strings.
  • param bool $replace_extra_symbols [optional] Add some more replacements e.g. "£" with " pound ".
  • param bool $use_str_to_lower [optional] Use "string to lower" for the input.
  • param bool $use_transliterate [optional] Use ASCII::to_transliterate() for unknown chars.
  • psalm-pure
  • return string
  • phpstan-param ASCII::*_LANGUAGE_CODE $language
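The steps in the description can be sketched in plain PHP. This is a simplified illustration, not the library's implementation: `sketch_slugify`, its four-entry transliteration table, and the order in which the steps run are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of a slug pipeline.
 *
 * @param array<string, string> $replacements caller-supplied replacements
 */
function sketch_slugify(string $str, string $separator = '-', array $replacements = []): string
{
    // 1. Apply caller-supplied replacements, e.g. '&' => 'and'.
    $str = strtr($str, $replacements);

    // 2. Replace a few non-ASCII characters with ASCII (tiny illustrative table).
    $str = strtr($str, ['ä' => 'a', 'ö' => 'o', 'ü' => 'u', 'é' => 'e']);

    // 3. Lowercase the (now mostly ASCII) string.
    $str = strtolower($str);

    // 4. Remove everything that is neither alphanumeric nor whitespace,
    //    including any non-ASCII bytes left over from step 2.
    $str = preg_replace('/[^a-z0-9\s]+/', '', $str) ?? '';

    // 5. Replace each run of whitespace with the separator.
    return preg_replace('/\s+/', $separator, trim($str)) ?? '';
}

echo sketch_slugify('Héllo Wörld!'), "\n";                        // hello-world
echo sketch_slugify('Tom & Jerry', '_', ['&' => 'and']), "\n";    // tom_and_jerry
```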
public static to_transliterate(string $str, $unknown = '?', bool $strict = false) : string
 

Returns an ASCII version of the string. A set of non-ASCII characters are replaced with their closest ASCII counterparts, and the rest are removed unless instructed otherwise.

EXAMPLE: ASCII::to_transliterate('déjà σσς iıii'); // 'deja sss iiii'

  • param string $str
  • param string|null $unknown [optional] Character to use when a character is unknown. (default is '?') You can also pass NULL to keep the unknown chars.
  • param bool $strict [optional] Use "transliterator_transliterate()" from PHP-Intl
  • psalm-pure
  • return string
  • noinspection ParameterDefaultValueIsNotNullInspection
Properties
private static $ASCII_EXTRAS = NULL
 
  • var array<string,array<string,string>>|null
private static $ASCII_MAPS = NULL
 
  • var array<string,array<string,string>>|null
private static $ASCII_MAPS_AND_EXTRAS = NULL
 
  • var array<string,array<string,string>>|null
private static $BIDI_UNI_CODE_CONTROLS_TABLE = [8234 => '‪', 8235 => '‫', 8236 => '‬', 8237 => '‭', 8238 => '‮', 8294 => '⁦', 8295 => '⁧', 8296 => '⁨', 8297 => '⁩']
 

bidirectional text chars

  • var array<int,string>
private static $LANGUAGE_MAX_KEY = NULL
 
  • var array<string,int>|null
private static $ORD = NULL
 
  • var array<string,int>|null
private static $REGEX_ASCII = '[^  -~]'
Methods
private static get_language(string $language)
 

Get the language from a string.

e.g.: de_at -> de_at, de_DE -> de, DE_DE -> de, de-de -> de

  • noinspection ReturnTypeCanBeDeclaredInspection
  • param string $language
  • psalm-pure
  • return string
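A minimal sketch that reproduces the four mappings listed above. The function `normalize_language_code` is hypothetical and is only checked against those documented examples; how the private method treats other inputs is not stated here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: normalize a language or locale code.
 */
function normalize_language_code(string $language): string
{
    // Lowercase and unify the separator: "de-DE" -> "de_de".
    $language = str_replace('-', '_', strtolower($language));

    // Collapse a region that merely repeats the language: "de_de" -> "de".
    $parts = explode('_', $language, 2);
    if (count($parts) === 2 && $parts[0] === $parts[1]) {
        return $parts[0];
    }

    // Anything else, such as "de_at", is kept as a distinct variant.
    return $language;
}

foreach (['de_at', 'de_DE', 'DE_DE', 'de-de'] as $code) {
    echo $code, ' -> ', normalize_language_code($code), "\n";
}
```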
private static getData(string $file)
 

Get data from "/data/*.php".

  • noinspection ReturnTypeCanBeDeclaredInspection
  • param string $file
  • psalm-pure
  • return array
private static getDataIfExists(string $file) : array
 

Get data from "/data/*.php".

  • param string $file
  • psalm-pure
  • return array
private static prepareAsciiAndExtrasMaps()
 
  • psalm-pure
  • return void
private static prepareAsciiExtras()
 
  • psalm-pure
  • return void
private static prepareAsciiMaps()
 
  • psalm-pure
  • return void
private static to_ascii_remap_intern(string $str, array $map) : string
 

WARNING: This method will return broken characters and is only for special cases.

Convert a UTF-8 encoded string to a single-byte string suitable for functions that need the same string length after the conversion.

The function simply uses (and updates) a tailored dynamic encoding (in/out map parameter) where non-ascii characters are remapped to the range [128-255] in order of appearance.

Thus, it supports up to 128 different multibyte code points max over the whole set of strings sharing this encoding.

Source: https://github.com/KEINOS/mb_levenshtein

  • param string $str
  • param array $map
  • return string
  • phpstan-param array<string, string> $map
© 2024 Bruce Wells