cutext/source.h: Text Source with Optional Lookahead
[cutext: Unicode and Text Handling]

Data Structures

struct  cutext_source_descriptor
struct  cutext_source

Source API



enum  cutext_source_info_key_t { CUTEXT_SOURCE_INFO_ENCODING = 1, CUTEXT_SOURCE_INFO_TABSTOP = 3 }
typedef char const * cutext_source_info_encoding_t
cu_bool_t cutext_source_info_key_inherits (cutext_source_info_key_t key)
cutext_source_descriptor_t cutext_source_descriptor (cutext_source_t src)
void cutext_source_init (cutext_source_t src, cutext_source_descriptor_t descriptor)
size_t cutext_source_read (cutext_source_t src, void *buf, size_t max_size)
size_t cutext_source_skip (cutext_source_t src, size_t max_size)
cu_bool_t cutext_source_can_look (cutext_source_t src)
void const * cutext_source_look (cutext_source_t src, size_t size, size_t *size_out)
void cutext_source_close (cutext_source_t src)
cu_box_t cutext_source_info (cutext_source_t src, cutext_source_info_key_t key)
cu_box_t cutext_source_info_inherit (cutext_source_t src, cutext_source_info_key_t key, cutext_source_t subsrc)
char const * cutext_source_encoding (cutext_source_t src)

Generic Callbacks



size_t cutext_source_null_read (cutext_source_t, void *, size_t)
void cutext_source_noop_close (cutext_source_t)
cutext_source_t cutext_source_no_subsource (cutext_source_t)
cu_box_t cutext_source_default_info (cutext_source_t, cutext_source_info_key_t)

Source Implementations



cutext_source_t cutext_source_new_mem (char const *enc, void const *data, size_t size)
cutext_source_t cutext_source_new_cstr (char const *cstr)
cutext_source_t cutext_source_new_str (cu_str_t str)
cutext_source_t cutext_source_new_wstring (cu_wstring_t wstr)
cutext_source_t cutext_source_fdopen (char const *enc, int fd, cu_bool_t close_fd)
cutext_source_t cutext_source_fopen (char const *encoding, char const *path)
cutext_source_t cutext_source_stack_buffer (cutext_source_t subsrc)
cutext_source_t cutext_source_stack_iconv (char const *newenc, cutext_source_t subsrc)

Algorithms



size_t cutext_source_count (cutext_source_t src)
cutext_encoding_t cutext_source_guess_encoding (cutext_source_t src)

Detailed Description

This provides the API and some implementations of sources with optional lookahead, intended for scanning text. These sources form the initial source and stack of conversions used by the more specialized lexical sources.

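Initial sources are provided for strings and files. These can be altered with the cutext_source_stack_* functions: cutext_source_stack_buffer stacks a buffer that provides lookahead, and cutext_source_stack_iconv stacks an iconv conversion to another encoding.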

See also:
cutext/lsource.h: Wide Character Source for Lexical Analysis
cutext/sink.h: Text Sink

Typedef Documentation

typedef char const* cutext_source_info_encoding_t

The type of the CUTEXT_SOURCE_INFO_ENCODING property. This trivial definition is provided to help keep type casts correct for boxing and unboxing operations.


Function Documentation

cu_bool_t cutext_source_can_look ( cutext_source_t  src  ) 

True iff src provides the cutext_source_look method.

void cutext_source_close ( cutext_source_t  src  ) 

Close src, which in turn should close its subsources.

size_t cutext_source_count ( cutext_source_t  src  ) 

Drain src and return the number of bytes which were left.
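As an illustration of what draining means, here is a minimal sketch that counts the remaining bytes by skipping until the end of the stream. It does not use culibs: struct demo_source, demo_skip and demo_count are hypothetical stand-ins that only mirror the documented skip contract, and the library may implement the count differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a source: only a length and a cursor. */
struct demo_source { size_t size, pos; };

/* Mirrors the documented skip contract: skips at most max_size bytes and
 * returns the number skipped, 0 at end of stream. The cap of 4 bytes per
 * call simulates a source that skips in short steps. */
static size_t demo_skip(struct demo_source *src, size_t max_size)
{
    size_t n = src->size - src->pos;
    if (n > max_size) n = max_size;
    if (n > 4) n = 4;
    src->pos += n;
    return n;
}

/* Drain src and return how many bytes were left, or (size_t)-1 on error. */
static size_t demo_count(struct demo_source *src)
{
    size_t total = 0, n;
    assert(src != NULL);
    while ((n = demo_skip(src, 4096)) != 0) {
        if (n == (size_t)-1) return (size_t)-1;  /* propagate the error */
        total += n;
    }
    return total;
}
```

After the call the source is at its end, so a second count on the same source yields 0.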

cu_box_t cutext_source_default_info ( cutext_source_t  ,
cutext_source_info_key_t   
)

Returns NULL for encoding, and 8 for tabstop.

char const* cutext_source_encoding ( cutext_source_t  src  ) 

The name of the character encoding of src, or NULL if unknown.

cutext_source_t cutext_source_fdopen ( char const *  enc,
int  fd,
cu_bool_t  close_fd 
)

Return a source over the contents read from fd encoded as enc. If close_fd is true, then close fd when cutext_source_close is called on the returned source.

cutext_source_t cutext_source_fopen ( char const *  encoding,
char const *  path 
)

Return a source over the contents of the file at path encoded as encoding.

cutext_encoding_t cutext_source_guess_encoding ( cutext_source_t  src  ) 

Try to guess which Unicode encoding is used in src. This requires that src supports lookahead.
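The heuristics used by the library are not described here. One common technique, shown as a sketch only, is to inspect a few lookahead bytes for a byte order mark. The enum and demo_guess_bom below are hypothetical names and are not part of cutext; a real caller would obtain the bytes from cutext_source_look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type; the library returns cutext_encoding_t instead. */
enum demo_encoding {
    DEMO_ENC_UNKNOWN, DEMO_ENC_UTF8,
    DEMO_ENC_UTF16LE, DEMO_ENC_UTF16BE,
    DEMO_ENC_UTF32LE, DEMO_ENC_UTF32BE
};

/* Classify the first n lookahead bytes by their byte order mark.
 * UTF-32LE is tested before UTF-16LE because its mark begins with
 * the same two bytes (FF FE). */
static enum demo_encoding demo_guess_bom(unsigned char const *p, size_t n)
{
    assert(p != NULL || n == 0);
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return DEMO_ENC_UTF8;
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0 && p[3] == 0)
        return DEMO_ENC_UTF32LE;
    if (n >= 4 && p[0] == 0 && p[1] == 0 && p[2] == 0xFE && p[3] == 0xFF)
        return DEMO_ENC_UTF32BE;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return DEMO_ENC_UTF16LE;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return DEMO_ENC_UTF16BE;
    return DEMO_ENC_UNKNOWN;  /* no mark found; other heuristics would be needed */
}
```

Because the bytes are only inspected and never consumed, this kind of check fits the lookahead requirement stated above.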

cu_box_t cutext_source_info_inherit ( cutext_source_t  src,
cutext_source_info_key_t  key,
cutext_source_t  subsrc 
)

Assist the source implementation src with a subsource subsrc in providing a suitable default value for key.

void cutext_source_init ( cutext_source_t  src,
cutext_source_descriptor_t  descriptor 
)

Initialize the base src of a source implementation with callbacks provided by descriptor.
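The fields of cutext_source_descriptor are not listed on this page, so the following is only a sketch of the general pattern: a concrete source embeds a base structure, and an init call binds that base to a table of callbacks. Every name here (demo_descriptor, demo_source, demo_repeat and the functions) is hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct demo_source;

/* Hypothetical callback table; the real descriptor's fields may differ. */
struct demo_descriptor {
    size_t (*read)(struct demo_source *, void *, size_t);
    void (*close)(struct demo_source *);
};

/* Base part, embedded as the first member of every concrete source. */
struct demo_source { struct demo_descriptor const *descriptor; };

static void demo_init(struct demo_source *src, struct demo_descriptor const *d)
{
    assert(d != NULL && d->read != NULL && d->close != NULL);
    src->descriptor = d;
}

/* A concrete source that yields `left` copies of one byte. */
struct demo_repeat { struct demo_source base; char byte; size_t left; int closed; };

static size_t repeat_read(struct demo_source *src, void *buf, size_t max_size)
{
    struct demo_repeat *self = (struct demo_repeat *)src;  /* base is first member */
    size_t i, n = self->left < max_size ? self->left : max_size;
    for (i = 0; i < n; ++i) ((char *)buf)[i] = self->byte;
    self->left -= n;
    return n;  /* 0 once exhausted, signalling end of stream */
}

static void repeat_close(struct demo_source *src)
{
    ((struct demo_repeat *)src)->closed = 1;
}

static struct demo_descriptor const repeat_descriptor = { repeat_read, repeat_close };

static void demo_repeat_init(struct demo_repeat *self, char byte, size_t count)
{
    demo_init(&self->base, &repeat_descriptor);
    self->byte = byte;
    self->left = count;
    self->closed = 0;
}

/* Generic entry points dispatch through the descriptor. */
static size_t demo_read(struct demo_source *src, void *buf, size_t max_size)
{
    return src->descriptor->read(src, buf, max_size);
}

static void demo_close(struct demo_source *src)
{
    src->descriptor->close(src);
}
```

Callers hold only the base pointer, so the same read and close calls work for any implementation that fills in the table.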

void const* cutext_source_look ( cutext_source_t  src,
size_t  size,
size_t *  size_out 
)

Request a lookahead of size bytes of upcoming data from src. If successful, the data is returned and its actual size is assigned to *size_out. The actual size may be larger than size; it is smaller only at the end of the stream or in case of error.

This method is only provided by some source implementations, as reported by cutext_source_can_look. If needed, cutext_source_stack_buffer stacks a buffer onto any source to provide lookahead.
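A typical use of lookahead is to test the upcoming bytes and consume them only on a match. The sketch below models that with hypothetical stand-ins (struct demo_source, demo_look, demo_skip, demo_accept) that follow the documented contracts. It assumes a failed look returns NULL, which this page does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a source: a cursor over a memory block. */
struct demo_source { char const *data; size_t size, pos; };

/* Mirrors the documented look contract: the reported size may exceed the
 * request and is smaller only at end of stream. Nothing is consumed. */
static void const *demo_look(struct demo_source *src, size_t size, size_t *size_out)
{
    (void)size;  /* the whole remainder is always available in memory */
    *size_out = src->size - src->pos;
    return src->data + src->pos;
}

/* Mirrors the documented skip contract. */
static size_t demo_skip(struct demo_source *src, size_t max_size)
{
    size_t n = src->size - src->pos;
    if (n > max_size) n = max_size;
    src->pos += n;
    return n;
}

/* Consume prefix if the upcoming bytes match it; otherwise leave src
 * untouched. Returns 1 on a match and 0 otherwise. */
static int demo_accept(struct demo_source *src, char const *prefix)
{
    size_t want = strlen(prefix), got = 0;
    char const *p;
    assert(src != NULL);
    p = demo_look(src, want, &got);
    if (p == NULL || got < want || memcmp(p, prefix, want) != 0)
        return 0;
    /* The look reported at least `want` bytes, so the skip is complete. */
    return demo_skip(src, want) == want;
}
```

With the library, cutext_source_can_look would be checked first, and cutext_source_stack_buffer applied if it reports false.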

cutext_source_t cutext_source_new_cstr ( char const *  cstr  ) 

Return a source over the 0-terminated UTF-8 string cstr.

cutext_source_t cutext_source_new_mem ( char const *  enc,
void const *  data,
size_t  size 
)

Return a source over size bytes of data stored at data, considered to be encoded as enc.

cutext_source_t cutext_source_new_str ( cu_str_t  str  ) 

Return a source over the UTF-8 string str.

cutext_source_t cutext_source_new_wstring ( cu_wstring_t  wstr  ) 

Return a source over the wide string wstr.

cutext_source_t cutext_source_no_subsource ( cutext_source_t   ) 

Returns NULL.

void cutext_source_noop_close ( cutext_source_t   ) 

The trivial close operation.

size_t cutext_source_null_read ( cutext_source_t  ,
void *  ,
size_t   
)

Always returns 0.

size_t cutext_source_read ( cutext_source_t  src,
void *  buf,
size_t  max_size 
)

Read at most max_size bytes into buf, returning the actual number read or (size_t)-1 on error. A return of 0 indicates the end of the stream; any other successful call reads at least one byte.
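Since a successful read may return fewer bytes than requested, callers usually loop. The following sketch shows such a loop against a hypothetical stand-in (struct demo_source and demo_read are not part of cutext) that follows the same return convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a source: a cursor over a memory block. */
struct demo_source { char const *data; size_t size, pos; };

/* Mirrors the documented read contract: at most max_size bytes, 0 at end
 * of stream, (size_t)-1 on error. A NULL data pointer simulates an error,
 * and the cap of 3 bytes per call simulates short reads. */
static size_t demo_read(struct demo_source *src, void *buf, size_t max_size)
{
    size_t n = src->size - src->pos;
    if (src->data == NULL) return (size_t)-1;
    if (n > max_size) n = max_size;
    if (n > 3) n = 3;
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return n;
}

/* Fill buf with up to cap bytes, looping over short reads.
 * Returns the total read, or (size_t)-1 if any read failed. */
static size_t demo_read_all(struct demo_source *src, char *buf, size_t cap)
{
    size_t total = 0;
    assert(buf != NULL);
    while (total < cap) {
        size_t n = demo_read(src, buf + total, cap - total);
        if (n == (size_t)-1) return (size_t)-1;  /* error */
        if (n == 0) break;                        /* end of stream */
        total += n;
    }
    return total;
}
```

The error check comes before the end-of-stream check because (size_t)-1 is a large positive value and would otherwise be added to the total.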

size_t cutext_source_skip ( cutext_source_t  src,
size_t  max_size 
)

Skips over at most max_size bytes, returning the actual number skipped, or (size_t)-1 on error. A return of 0 indicates the end of the stream. If a previous call to cutext_source_look has returned a size of max_size or larger since the last read or skip, then this call succeeds with the full count.

cutext_source_t cutext_source_stack_buffer ( cutext_source_t  subsrc  ) 

Stack a buffer on top of subsrc to provide lookahead. The cutext_source_look method is guaranteed to be available on the returned source.

cutext_source_t cutext_source_stack_iconv ( char const *  newenc,
cutext_source_t  subsrc 
)

Stack an iconv conversion filter on subsrc, encoding to newenc.
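For orientation, this is roughly what a single conversion step looks like with the POSIX iconv interface. It is a sketch and not the filter's implementation: demo_convert is a hypothetical helper that converts one complete buffer, whereas a stacked filter would convert incrementally as data is read from subsrc.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Convert in[0..in_size) from from_enc to to_enc, writing to out.
 * Returns the number of bytes written, or (size_t)-1 on failure
 * (unknown encoding, invalid input, or insufficient output space). */
static size_t demo_convert(char const *to_enc, char const *from_enc,
                           char const *in, size_t in_size,
                           char *out, size_t out_cap)
{
    iconv_t cd;
    char *inp = (char *)in;  /* iconv takes char **, but only reads the input */
    char *outp = out;
    size_t in_left = in_size, out_left = out_cap, rc;

    assert(in != NULL && out != NULL);
    cd = iconv_open(to_enc, from_enc);  /* note the order: target first */
    if (cd == (iconv_t)-1) return (size_t)-1;
    rc = iconv(cd, &inp, &in_left, &outp, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) return (size_t)-1;
    return out_cap - out_left;
}
```

For example, the single ISO-8859-1 byte 0xE9 converts to the two UTF-8 bytes 0xC3 0xA9. Stateful target encodings would additionally need a final flush call, which this sketch omits.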

Generated 2009-11-23 for culibs-0.25 using Doxygen. Maintained by Petter Urkedal.