Biber::Utils

Section: User Contributed Perl Documentation (3)
Updated: 2020-07-27

NAME

Biber::Utils - Various utility subs used in Biber  

EXPORT

All functions are exported by default.  

FUNCTIONS

 

glob_data_file

  Expands a data file glob to a list of filenames

 

locate_data_file

  Searches for a data file by checking, in turn:

  The exact path if the filename is absolute
  In the input_directory, if defined
  In the output_directory, if defined
  Relative to the current directory
  In the same directory as the control file
  Using kpsewhich, if available
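
The search list above can be read as a first-match-wins lookup. The following is a minimal Python sketch of that idea, not Biber's Perl code. The function name, the parameter names and the decision to return None for a missing absolute path are all assumptions made for illustration; only the list of locations comes from the description above.

```python
import os
import shutil
import subprocess


def locate_data_file_sketch(filename, input_directory=None,
                            output_directory=None, control_file=None):
    """Return the first existing candidate for filename, or None."""
    # An absolute path is used exactly as given (assumption: no fallback).
    if os.path.isabs(filename):
        return filename if os.path.exists(filename) else None

    candidates = []
    if input_directory:
        candidates.append(os.path.join(input_directory, filename))
    if output_directory:
        candidates.append(os.path.join(output_directory, filename))
    candidates.append(filename)  # relative to the current directory
    if control_file:
        candidates.append(
            os.path.join(os.path.dirname(control_file), filename))

    for candidate in candidates:
        if os.path.exists(candidate):
            return candidate

    # Last resort: ask kpsewhich, but only if it is installed.
    kpsewhich = shutil.which("kpsewhich")
    if kpsewhich:
        result = subprocess.run([kpsewhich, filename],
                                capture_output=True, text=True)
        found = result.stdout.strip()
        if found:
            return found
    return None
```

Building the candidate list first and probing it afterwards keeps the search order visible in one place.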

 

  Checks for the existence of NFC/NFD file variants and returns the correct one.
  Accounts for Windows file encodings.

 

biber_warn

    Wrapper around various bits and pieces of warning handling.
    Logs a warning, adds the warning to the list of .bbl warnings and optionally
    increments the warning count in the Biber object, if present.

 

biber_error

    Wrapper around error logging
    Forces an exit.

 

makenamesid

Given a Biber::Names object, return an underscore normalised concatenation of all of the full name strings.  

makenameid

Given a Biber::Name object, return an underscore normalised concatenation of the full name strings.  

latex_recode_output

  Tries to convert UTF-8 to TeX macros in passed string

 

strip_noinit

  Removes elements which are not to be considered during initials generation
  in names

 

strip_nosort

  Removes elements which are not to be used in sorting a name from a string

 

normalise_string_label

Remove some things from a string for label generation. Don't strip \p{Dash} as this is needed to process compound names or label generation.  

normalise_string_sort

Removes LaTeX macros, and all punctuation, symbols, separators as well as leading and trailing whitespace for sorting strings. Control chars don't need to be stripped as they are completely ignorable in DUCET  

normalise_string_bblxml

Some string normalisation for bblxml output  

normalise_string

Removes LaTeX macros, and all punctuation, symbols, separators and control characters, as well as leading and trailing whitespace for sorting strings. Only decodes LaTeX character macros into Unicode if output is UTF-8  

normalise_string_common

  Common bit for normalisation

 

normalise_string_hash

  Normalise strings used for hashes. We collapse LaTeX macros into a vestige
  so that hashes are unique between things like:

  Smith
  {\v S}mith

  we replace macros like this to preserve their vestiges:

  \v S -> v:
  \" -> 34:

 

normalise_string_underscore

  Like normalise_string, but also substitutes ~ and whitespace with underscore.

 

escape_label

  Escapes a few special characters which might be used in labels

 

unescape_label

  Unescapes a few special characters which might be used in labels but which need
  sorting without escapes

 

reduce_array

reduce_array(\@a, \@b) returns all elements in @a that are not in @b  
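
As an illustration of the list difference described above, here is a small Python sketch. It is not the Perl implementation; that order and duplicates from the first list are preserved is an assumption of the sketch.

```python
def reduce_array_sketch(a, b):
    """Return the elements of a that do not occur in b."""
    exclude = set(b)  # set membership keeps the filter linear in len(a)
    return [item for item in a if item not in exclude]
```

For example, reduce_array_sketch(["a", "b", "c"], ["b"]) yields ["a", "c"].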

remove_outer

    Remove surrounding curly brackets:
        '{string}' -> 'string'
    but not
        '{string} {string}' -> 'string} {string'

    Return (boolean if stripped, string)
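
The point of the second example above is that the outer braces may only be removed when the first brace is closed by the last one. One way to check this is to track brace depth, as in the Python sketch below. This is an illustration under assumptions, not Biber's code: the function name is hypothetical, and backslash-escaped braces are not treated specially.

```python
def remove_outer_sketch(text):
    """Return (stripped?, string), stripping one matching outer brace pair."""
    if len(text) >= 2 and text[0] == "{" and text[-1] == "}":
        depth = 0
        for index, char in enumerate(text):
            if char == "{":
                depth += 1
            elif char == "}":
                depth -= 1
                # The opening brace closed before the end of the string,
                # so the first and last brace are not a pair.
                if depth == 0 and index != len(text) - 1:
                    return (False, text)
        if depth == 0:
            return (True, text[1:-1])
    return (False, text)
```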

 

has_outer

    Returns a boolean indicating whether the string is surrounded by braces

 

add_outer

    Add surrounding curly brackets:
        'string' -> '{string}'

 

ucinit

    Upper-cases the initial letters in a string

 

is_undef

    Checks for undefness of arbitrary things, including
    composite method chain calls which don't reliably work
    with defined() (see perldoc for defined())
    This works because we are just testing the value passed
    to this sub. So, for example, this is randomly unreliable
    even if the resulting value of the arg to defined() is "undef":

    defined($thing->method($arg)->method)

    whereas:

    is_undef($thing->method($arg)->method)

    works since we only test the return value of all the methods
    with defined()

 

is_def

    Checks for definedness in the same way as is_undef()

 

is_undef_or_null

    Checks for undef or nullness (see is_undef() above)

 

is_def_and_notnull

    Checks for def and unnullness (see is_undef() above)

 

is_def_and_null

    Checks for def and nullness (see is_undef() above)

 

is_null

    Checks for nullness

 

is_notnull

    Checks for notnullness

 

is_notnull_scalar

    Checks for notnullness of a scalar

 

is_notnull_array

    Checks for notnullness of an array (passed by ref)

 

is_notnull_hash

    Checks for notnullness of a hash (passed by ref)

 

is_notnull_object

    Checks for notnullness of an object (passed by ref)

 

stringify_hash

    Turns a hash into a string of keys and values

 

normalise_utf8

  Normalise any UTF-8 encoding string immediately to exactly what we want
  We want the strict perl utf8 "UTF-8"
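
The idea is that loosely spelled encoding names are folded to the one strict name. The Python sketch below illustrates this for a single name. The function name and the exact set of accepted spellings ("utf8" or "utf-8" in any letter case) are assumptions, not taken from Biber.

```python
import re


def normalise_encoding_name_sketch(name):
    """Map any 'utf8'/'utf-8' spelling to the strict name 'UTF-8'."""
    if re.fullmatch(r"utf-?8", name, flags=re.IGNORECASE):
        return "UTF-8"
    return name  # other encodings are left untouched
```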

 

inits

   We turn the initials into an array so we can be flexible with them later.
   The tie here is used only so we know what to split on. We don't want to make
   any typesetting decisions in Biber, such as what to use to join initials, so
   on output to the .bbl we only use BibLaTeX macros.

 

join_name

  Replace all join typesetting elements in a name part (space, ties) with BibLaTeX macros
  so that typesetting decisions are made in BibLaTeX, not hard-coded in Biber

 

filter_entry_options

    Process any per_entry option transformations which are necessary on output

 

imatch

    Do an interpolating (neg)match using a match RE and a string passed in as variables
    Using /g on matches so that $1,$2 etc. can be populated from repeated matches of
    same capture group as well as different groups

 

ireplace

    Do an interpolating match/replace using a match RE, replacement RE
    and string passed in as variables
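
Both imatch and ireplace take the pattern as data rather than as a literal in the source. The Python sketch below shows the same idea with the standard re module. It is a simplified illustration: the function names, argument order and the negate flag are assumptions, the match helper only returns a boolean rather than populating capture variables, and Python writes backreferences as \1 where Perl uses $1.

```python
import re


def imatch_sketch(value, pattern, negate=False):
    """Match value against a pattern held in a variable; optionally negate."""
    matched = re.search(pattern, value) is not None
    return not matched if negate else matched


def ireplace_sketch(value, pattern, replacement):
    """Replace every match of pattern in value with replacement."""
    return re.sub(pattern, replacement, value)
```

For example, ireplace_sketch("John Smith", r"(\w+) (\w+)", r"\2, \1") gives "Smith, John".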

 

validate_biber_xml

  Validate a biber/biblatex XML metadata file against an RNG XML schema

 

map_boolean

    Converts booleans between strings and numbers. This is needed because the
    standard XML "boolean" datatype treats "true" and "1" as the same value, and
    likewise "false" and "0".
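
A minimal Python sketch of such a two-way mapping follows. The two helper names and the choice to raise an error on unknown input are assumptions for illustration; the real sub's calling convention is not shown in this page.

```python
def boolean_to_num_sketch(value):
    """Map 'true'/'1' to 1 and 'false'/'0' to 0."""
    text = str(value).strip().lower()
    if text in ("true", "1"):
        return 1
    if text in ("false", "0"):
        return 0
    raise ValueError(f"not an XML boolean: {value!r}")


def boolean_to_str_sketch(value):
    """Map any accepted boolean spelling to 'true' or 'false'."""
    return "true" if boolean_to_num_sketch(value) == 1 else "false"
```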

 

process_entry_options

    Set per-entry options

 

merge_entry_options

    Merge entry options, dealing with conflicts

 

expand_option_input

    Expand options such as meta-options coming from biblatex

 

parse_date_range

  Parses an ISO8601 date range
  Returns two-element array ref: [start DT object, end DT object]
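
The shape of the return value can be illustrated with a much reduced Python sketch. It assumes that the two ends are separated by "/" as in ISO8601 intervals and that both are full YYYY-MM-DD dates; it uses Python's date type in place of the DT objects mentioned above and represents a missing end as None. None of this reflects the full range of formats the real sub accepts.

```python
from datetime import date


def parse_date_range_sketch(text):
    """Split 'start/end' and return [start, end]; a missing end is None."""
    start, separator, end = text.partition("/")
    start_date = date.fromisoformat(start) if start else None
    end_date = date.fromisoformat(end) if (separator and end) else None
    return [start_date, end_date]
```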

 

parse_date_unspecified

  Parses the ISO8601-2:2016 4.3 unspecified format into a date range
  Returns range plus specification of granularity of unspecified

 

parse_date_start

  Convenience wrapper

 

parse_date_end

  Convenience wrapper

 

parse_date

  Parses EDTF dates

 

date_monthday

  Force month/day to ISO8601-2:2016 format with leading zero
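
Zero-padding a month or day number is a one-line formatting step. A Python sketch, with a hypothetical function name and the assumption that an absent value is passed through unchanged:

```python
def date_monthday_sketch(value):
    """Return a month or day as a two-digit string, e.g. 3 -> '03'."""
    if value is None:
        return None  # assumption: nothing to format
    return f"{int(value):02d}"
```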

 

biber_decode_utf8

    Perform NFD form conversion as well as UTF-8 conversion. Used to normalize
    bibtex input as the T::B interface doesn't allow a neat whole file slurping.

 

out

  Output to target. Outputs NFC UTF-8 if output is UTF-8
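
biber_decode_utf8 and out describe two ends of the same convention: decomposed (NFD) text internally, composed (NFC) text on UTF-8 output. The Python sketch below shows that round trip with the standard unicodedata module. The function names are hypothetical and no Biber option handling is modelled.

```python
import unicodedata


def decode_to_nfd_sketch(raw_bytes):
    """Decode UTF-8 bytes and normalise the text to NFD."""
    return unicodedata.normalize("NFD", raw_bytes.decode("utf-8"))


def encode_as_nfc_sketch(text):
    """Normalise text to NFC and encode it as UTF-8 bytes."""
    return unicodedata.normalize("NFC", text).encode("utf-8")
```

With this, a precomposed "é" read from a file becomes "e" followed by a combining acute accent internally, and is recomposed when written out.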

 

process_comment

  Fix up some problems with comments after being processed by btparse

 

locale2bcp47

  Map babel/polyglossia language options to a sensible CLDR (bcp47) locale default
  Return input string if there is no mapping

 

bcp472locale

  Map CLDR (bcp47) locale to a babel/polyglossia locale
  Return input string if there is no mapping
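
Both subs follow the same pattern: a table lookup that falls back to the input. The Python sketch below shows the pattern only. The two table entries are illustrative placeholders chosen for this example and do not reproduce Biber's mapping table; the function names are likewise hypothetical.

```python
# Illustrative entries only; the real table is maintained inside Biber.
_LOCALE_TO_BCP47 = {"american": "en-US", "british": "en-GB"}
_BCP47_TO_LOCALE = {tag: name for name, tag in _LOCALE_TO_BCP47.items()}


def locale2bcp47_sketch(name):
    """Return the mapped tag, or the input unchanged if unmapped."""
    return _LOCALE_TO_BCP47.get(name, name)


def bcp472locale_sketch(tag):
    """Return the mapped locale name, or the input unchanged if unmapped."""
    return _BCP47_TO_LOCALE.get(tag, tag)
```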

 

rangelen

  Calculate the length of a range field
  Range fields are an array ref of two-element array refs [range_start, range_end]
  range_end can be empty for an open-ended range, or undef
  Deals with Unicode and ASCII roman numerals via the magic of Unicode NFKD form

  m-n -> [m, n]
  m   -> [m, undef]
  m-  -> [m, '']
  -n  -> ['', n]
  -   -> ['', undef]
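
The NFKD remark above works because Unicode roman numeral characters have compatibility decompositions into plain letters, for example U+216B (ROMAN NUMERAL TWELVE) decomposes to "XII". The Python sketch below uses this to count the length of parsed ranges. It is an illustration under assumptions: the function names are hypothetical, a single value counts as length 1, and an open-ended range makes the sketch return None, which may differ from what Biber returns.

```python
import unicodedata

_ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def to_int_sketch(text):
    """Convert Arabic or Roman numerals (ASCII or Unicode) to an int."""
    folded = unicodedata.normalize("NFKD", text).upper()
    if folded.isdigit():
        return int(folded)
    total = 0
    for index, char in enumerate(folded):
        value = _ROMAN[char]
        following = _ROMAN[folded[index + 1]] if index + 1 < len(folded) else 0
        # A smaller numeral before a larger one is subtracted (IV = 4).
        total += -value if value < following else value
    return total


def rangelen_sketch(ranges):
    """Sum the lengths of [start, end] pairs; None if any is open-ended."""
    total = 0
    for start, end in ranges:
        if start == "" or end == "":
            return None  # open-ended: length unknown (assumption)
        if end is None:
            total += 1  # a single value, e.g. one page
        else:
            total += to_int_sketch(end) - to_int_sketch(start) + 1
    return total
```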

 

match_indices

  Return array ref of array refs of matches and start indices of matches
  for provided array of compiled regexps into string
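
A Python sketch of the described return shape follows, using compiled patterns from the standard re module. The function name and argument order are hypothetical, and the ordering of the result (grouped by pattern, then by position) is an assumption of the sketch.

```python
import re


def match_indices_sketch(regexps, text):
    """Return [[matched_text, start_index], ...] for each compiled regexp."""
    found = []
    for regexp in regexps:
        for match in regexp.finditer(text):
            found.append([match.group(0), match.start()])
    return found
```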

 

parse_range

  Parses a range of values into a two-value array ref.
  Ranges with no starting value default to "1"
  Ranges can be open-ended and it's up to surrounding code to interpret this
  Ranges can be single figures which is shorthand for 1-x

 

strip_annotation

  Removes annotation marker from a field name

 

parse_range_alt

  Parses a range of values into a two-value array ref.
  Either start or end can be undef and it's up to surrounding code to interpret this
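
The difference between parse_range and parse_range_alt is in how missing ends are filled in. The Python sketch below contrasts the two behaviours as described in this page. It is not Biber's code: the function names are hypothetical, only the ASCII hyphen is treated as a separator, values are kept as strings, and None stands in for both an open end and undef.

```python
def parse_range_sketch(text):
    """'m-n' -> [m, n]; a missing start becomes '1'; 'x' means 1-x."""
    start, dash, end = (part.strip() for part in text.partition("-"))
    if dash:
        return [start or "1", end or None]  # None marks an open end
    return ["1", start]


def parse_range_alt_sketch(text):
    """'m-n' -> [m, n]; either side may be None, with no defaults applied."""
    start, dash, end = (part.strip() for part in text.partition("-"))
    return [start or None, (end or None) if dash else None]
```

So "-5" parses to ["1", "5"] with the first helper but to [None, "5"] with the second, leaving interpretation to the caller.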

 

maploopreplace

  Replace loop markers with values.

 

get_transliterator

  Get a ref to a transliterator for the given from/to
  We are abstracting this in this way because it is not clear what the future
  of the transliteration library is. We want to be able to switch.

 

call_transliterator

  Run a transliterator on passed text. Hides call semantics of transliterator
  so we can switch engine in the future.

 

AUTHOR

Philip Kime "<philip at kime.org.uk>"  

BUGS

Please report any bugs or feature requests on our Github tracker at <https://github.com/plk/biber/issues>.  

COPYRIGHT & LICENSE

Copyright 2012-2019 Philip Kime, all rights reserved.

This module is free software. You can redistribute it and/or modify it under the terms of the Artistic License 2.0.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.


 
