Sections:
In C, variables have types and data do not have types. In contrast, Ruby variables do not have a static type, and data themselves have types, so data will need to be converted between the languages.
Data in Ruby are represented by C type `VALUE'. Each VALUE data has its data-type.
To retrieve C data from a VALUE, you need to:
T_NIL nil
T_OBJECT ordinary object
T_CLASS class
T_MODULE module
T_FLOAT floating point number
T_STDING string
T_REGEXP regular expression
T_ARRAY array
T_FIXNUM Fixnum(31bit integer)
T_HASH associative array
T_STDUCT (Ruby) structure
T_BIGNUM multi precision integer
T_FILE IO
T_TDUE true
T_FALSE false
T_DATA data
T_SYMBOL symbol
In addition, there are several other types used internally:
T_ICLASS
T_MATCH
T_UNDEF
T_VARMAP
T_TSCOPE
T_NODE
Most of the types are represented by C structures.
switch (TYPE(obj)) {
case T_FIXNUM:
/* process Fixnum */
break;
case T_STRING:
/* process String */
break;
case T_ARRAY:
/* process Array */
break;
default:
/* raise exception */
rb_raise(rb_eTypeError, "not valid value");
break;
}
There is the data-type check function
void Check_Type(VALUE value, int type)
which raises an exception if the VALUE does not have the type specified. There are also faster check macros for fixnums and nil.
FIXNUM_P(obj) NIL_P(obj)
The data for type T_NIL, T_FALSE, T_TRUE are nil, true, false respectively. They are singletons for the data type.
The T_FIXNUM data is a 31bit length fixed integer (63bit length on some machines), which can be convert to a C integer by using the FIX2INT() macro. There is also NUM2INT() which converts any Ruby numbers into C integers. The NUM2INT() macro includes a type check, so an exception will be raised if the conversion failed. NUM2DBL() can be used to retrieve the double float value in same way.
To get char* from a VALUE, version 1.7 recommend to use new macros StringValue() and StringValuePtr(). StringValue(var) replaces var's value to the result of "var.to_str()". StringValuePtr(var) does same replacement and returns char* representation of var. These macros will skip the replacement if var is a String. Notice that the macros requires to take only lvalue as their argument, to change the value of var in the replacement.
In version 1.6 or earlier, STR2CSTR() was used to do same thing but now it is obsoleted in version 1.7 because of STR2CSTR() has a risk of dangling pointer problem in to_str() impliclit conversion.
Other data types have corresponding C structures, e.g. struct RArray for T_ARRAY etc. The VALUE of the type which has corresponding structure can be cast to retrieve the pointer to the struct. The casting macro will be of the form RXXXX for each data type; for instance, RARRAY(obj). See "ruby.h".
For example, `RSTRING(str)->len' is the way to get the size of the Ruby String object. The allocated region can be accessed by `RSTRING(str)->ptr'. For arrays, use `RARRAY(ary)->len' and `RARRAY(ary)->ptr' respectively.
Notice: Do not change the value of the structure directly, unless you are responsible for the result. This ends up being the cause of interesting bugs.
FIXNUM -- left shift 1 bit, and turn on LSB. Other pointer values -- cast to VALUE.
You can determine whether a VALUE is pointer or not by checking its LSB.
Notice Ruby does not allow arbitrary pointer values to be a VALUE. They
should be pointers to the structures which Ruby knows about. The known
structures are defined in
INT2FIX() for integers within 31bits. INT2NUM() for arbitrary sized integer.INT2NUM() converts an integer into a Bignum if it is out of the FIXNUM range, but is a bit slower.
rb_str_new(const char *ptr, long len)
--Creates a new Ruby string.
rb_str_new2(const char *ptr)
-- Creates a new Ruby string from a C string.
This is equivalent to rb_str_new(ptr, strlen(ptr)).
rb_tainted_str_new(const char *ptr, long len)
-- Creates a new tainted Ruby string.
Strings from external data sources should be tainted.
rb_tainted_str_new2(const char *ptr)
-- Creates a new tainted Ruby string from a C string.
rb_str_cat(VALUE str, const char *ptr, long len)
-- Appends len bytes of data from ptr to the Ruby string.
rb_ary_new()
-- Creates an array with no elements.
rb_ary_new2(long len)
-- Creates an array with no elements, allocating
internal buffer for len elements.
rb_ary_new3(long n, ...)
-- Creates an n-element array from the arguments.
rb_ary_new4(long n, VALUE *elts)
-- Creates an n-element array from a C array.
rb_ary_push(VALUE ary, VALUE val)
rb_ary_pop(VALUE ary)
rb_ary_shift(VALUE ary)
rb_ary_unshift(VALUE ary, VALUE val)
-- Array operations:.
The first argument to each functions must be an array.
They may dump core if other types given.
VALUE rb_define_class(const char *name, VALUE super) VALUE rb_define_module(const char *name)These functions return the newly created class or module. You may want to save this reference into a variable to use later. To define nested classes or modules, use the functions below:
VALUE rb_define_class_under(VALUE outer, const char *name, VALUE super) VALUE rb_define_module_under(VALUE outer, const char *name)
void rb_define_method(VALUE klass, const char *name, VALUE (*func)(), int argc) void rb_define_singleton_method(VALUE object, const char *name, VALUE (*func)(), int argc)
The `argc' represents the number of the arguments to the C function, which must be less than 17. But I believe you don't need that much. :-)
If `argc' is negative, it specifies the calling sequence, not number of the arguments.
If argc is -1, the function will be called as:
VALUE func(int argc, VALUE *argv, VALUE obj)
where argc is the actual number of arguments, argv is the C array of the arguments, and obj is the receiver.
If argc is -2, the arguments are passed in a Ruby array. The function will be called like:
VALUE func(VALUE obj, VALUE args)
where obj is the receiver, and args is the Ruby array containing actual arguments.
There are two more functions to define methods. One is to define private methods:
void rb_define_private_method(VALUE klass, const char *name, VALUE (*func)(), int argc)
The other is to define module functions, which are private AND singleton methods of the module. For example, sqrt is the module function defined in Math module. It can be call in the form like:
Math.sqrt(4)or
include Math sqrt(4)To define module functions, use:
void rb_define_module_function(VALUE module, const char *name, VALUE (*func)(), int argc)Oh, in addition, function-like methods, which are private methods defined in the Kernel module, can be defined using:
void rb_define_global_function(const char *name, VALUE (*func)(), int argc)To define alias to the method,
void rb_define_alias(VALUE module, const char* new, const char* old);
void rb_define_const(VALUE klass, const char *name, VALUE val) void rb_define_global_const(const char *name, VALUE val)The former is to define a constant under specified class/module. The latter is to define a global constant.
VALUE rb_eval_string(const char *str)Evaluation is done under the current context, thus current local variables of the innermost method (which is defined by Ruby) can be accessed.
:IdentifierYou can get the symbol value from a string within C code by using
rb_intern(const char *name)
VALUE rb_funcall(VALUE recv, ID mid, int argc, ...)This function invokes a method on the recv, with the method name specified by the symbol mid.
The functions to access/modify instance variables are below:
VALUE rb_ivar_get(VALUE obj, ID id) VALUE rb_ivar_set(VALUE obj, ID id, VALUE val)id must be the symbol, which can be retrieved by rb_intern(). To access the constants of the class/module:
VALUE rb_const_get(VALUE obj, ID id)See 2.1.3 for defining new constant.
Qtrue QfalseBoolean values. Qfalse is false in C also (i.e. 0).
QnilRuby nil in C scope.
void rb_define_variable(const char *name, VALUE *var)This function defines the variable which is shared by both environments. The value of the global variable pointed to by `var' can be accessed through Ruby's global variable named `name'.
You can define read-only (from Ruby, of course) variables using the function below.
void rb_define_readonly_variable(const char *name, VALUE *var)You can defined hooked variables. The accessor functions (getter and setter) are called on access to the hooked variables.
void rb_define_hooked_variable(constchar *name, VALUE *var, VALUE (*getter)(), void (*setter)())If you need to supply either setter or getter, just supply 0 for the hook you don't need. If both hooks are 0, rb_define_hooked_variable() works just like rb_define_variable().
void rb_define_virtual_variable(const char *name, VALUE (*getter)(), void (*setter)())This function defines a Ruby global variable without a corresponding C variable. The value of the variable will be set/get only by hooks.
The prototypes of the getter and setter functions are as follows:
(*getter)(ID id, void *data, struct global_entry* entry); (*setter)(VALUE val, ID id, void *data, struct global_entry* entry);
Data_Wrap_Struct(klass, mark, free, ptr)Data_Wrap_Struct() returns a created DATA object. The klass argument is the class for the DATA object. The mark argument is the function to mark Ruby objects pointed by this data. The free argument is the function to free the pointer allocation. If this is -1, the pointer will be just freed. The functions mark and free will be called from garbage collector. You can allocate and wrap the structure in one step.
Data_Make_Struct(klass, type, mark, free, sval)This macro returns an allocated Data object, wrapping the pointer to the structure, which is also allocated. This macro works like:
(sval = ALLOC(type), Data_Wrap_Struct(klass, mark, free, sval))Arguments klass, mark, and free work like their counterparts in Data_Wrap_Struct(). A pointer to the allocated structure will be assigned to sval, which should be a pointer of the type specified. To retrieve the C pointer from the Data object, use the macro Data_Get_Struct().
Data_Get_Struct(obj, type, sval)A pointer to the structure will be assigned to the variable sval. See the example below for details.
% mkdir ext/dbmMake a directory for the extension library under ext directory.
You need to write C code for your extension library. If your library has only one source file, choosing ``LIBRARY.c'' as a file name is preferred. On the other hand, in case your library has multiple source files, avoid choosing ``LIBRARY.c'' for a file name. It may conflict with an intermediate file ``LIBRARY.o'' on some platforms.
Ruby will execute the initializing function named ``Init_LIBRARY'' in the library. For example, ``Init_dbm()'' will be executed when loading the library.
Here's the example of an initializing function.
Init_dbm()
{
/* define DBM class */
cDBM = rb_define_class("DBM", rb_cObject);
/* DBM includes Enumerate module */
rb_include_module(cDBM, rb_mEnumerable);
/* DBM has class method open(): arguments are received as C array */
rb_define_singleton_method(cDBM, "open", fdbm_s_open, -1);
/* DBM instance method close(): no args */
rb_define_method(cDBM, "close", fdbm_close, 0);
/* DBM instance method []: 1 argument */
rb_define_method(cDBM, "[]", fdbm_fetch, 1);
:
/* ID for a instance variable to store DBM data */
id_dbm = rb_intern("dbm");
}
The dbm extension wraps the dbm struct in the C environment using
Data_Make_Struct.
struct dbmdata {
int di_size;
DBM *di_dbm;
};
obj = Data_Make_Struct(klass, struct dbmdata, 0, free_dbm, dbmp);
This code wraps the dbmdata structure into a Ruby object. We avoid wrapping
DBM* directly, because we want to cache size information.
To retrieve the dbmdata structure from a Ruby object, we define the
following macro:
#define GetDBM(obj, dbmp) {\
Data_Get_Struct(obj, struct dbmdata, dbmp);\
if (dbmp->di_dbm == 0) closed_dbm();\
}
This sort of complicated macro does the retrieving and close checking for
the DBM.
There are three kinds of way to receive method arguments. First,
methods with a fixed number of arguments receive arguments like this:
static VALUE
fdbm_delete(obj, keystr)
VALUE obj, keystr;
{
:
}
The first argument of the C function is the self, the rest are the
arguments to the method.
Second, methods with an arbitrary number of arguments receive
arguments like this:
static VALUE
fdbm_s_open(argc, argv, klass)
int argc;
VALUE *argv;
VALUE klass;
{
:
if (rb_scan_args(argc, argv, "11", &file, &vmode) == 1) {
mode = 0666; /* default value */
}
:
}
The first argument is the number of method arguments, the second argument is the C array of the method arguments, and the third argument is the receiver of the method.
You can use the function rb_scan_args() to check and retrieve the arguments. For example, "11" means that the method requires at least one argument, and at most receives two arguments.
Methods with an arbitrary number of arguments can receive arguments by Ruby's array, like this:
static VALUE
fdbm_indexes(obj, args)
VALUE obj, args;
{
:
}
The first argument is the receiver, the second one is the Ruby array
which contains the arguments to the method.
static VALUE** Notice GC should know about global variables which refer to Ruby's objects, but are not exported to the Ruby world. You need to protect them by
(4) prepare extconf.rb
If the file named extconf.rb exists, it will be executed to generate Makefile.
extconf.rb is the file for check compilation conditions etc. You need to put:
require 'mkmf'at the top of the file. You can use the functions below to check various conditions.have_library(lib, func): // check whether library containing function exists. have_func(func, header): // check whether function exists have_header(header): // check whether header file exists create_makefile(target): // generate MakefileThe value of the variables below will affect the Makefile.$CFLAGS: // included in CFLAGS make variable (such as -I) $LDFLAGS: // included in LDFLAGS make variable (such as -L)If a compilation condition is not fulfilled, you should not call ``create_makefile''. The Makefile will not generated, compilation will not be done.(5) prepare depend (optional)
If the file named depend exists, Makefile will include that file to check dependencies. You can make this file by invoking% gcc -MM *.c > dependIt's no harm. Prepare it.(6) generate Makefile
Try generating the Makefile by:ruby extconf.rbYou don't need this step if you put the extension library under the ext directory of the ruby source tree. In that case, compilation of the interpreter will do this step for you.(7) make
maketo compile your extension. You don't need this step either if you have put extension library under the ext directory of the ruby source tree.(8) debug
You may need to rb_debug the extension. Extensions can be linked statically by the adding directory name in the ext/Setup file so that you can inspect the extension with the debugger.(9) done, now you have the extension library
You can do anything you want with your library. The author of Ruby will not claim any restrictions on your code depending on the Ruby API. Feel free to use, modify, distribute or sell your program.Appendix A. Ruby source files overview
ruby language core
class.c error.c eval.c gc.c object.c parse.y variable.cutility functions
dln.c regex.c st.c util.cruby interpreter implementation
dmyext.c inits.c main.c ruby.c version.c
array.c bignum.c compar.c dir.c enum.c file.c hash.c io.c marshal.c math.c numeric.c pack.c prec.c process.c random.c range.c re.c signal.c sprintf.c string.c struct.c time.c
VALUEThe type for the Ruby object. Actual structures are defined in ruby.h, such as struct RString, etc. To refer the values in structures, use casting macros like RSTRING(obj).
Qnil const: nil object Qtrue const: true object(default true value) Qfalse const: false object
Data_Wrap_Struct(VALUE klass, void (*mark)(), void (*free)(), void *sval)Wrap a C pointer into a Ruby object. If object has references to other Ruby objects, they should be marked by using the mark function during the GC process. Otherwise, mark should be 0. When this object is no longer referred by anywhere, the pointer will be discarded by free function.
Data_Make_Struct(klass, type, mark, free, sval)This macro allocates memory using malloc(), assigns it to the variable sval, and returns the DATA encapsulating the pointer to memory region.
Data_Get_Struct(data, type, sval)This macro retrieves the pointer value from DATA, and assigns it to the variable sval.
TYPE(value) FIXNUM_P(value) NIL_P(value) void Check_Type(VALUE value, int type) void Check_SafeStr(VALUE value)
FIX2INT(value) INT2FIX(i) NUM2INT(value) INT2NUM(i) NUM2DBL(value) rb_float_new(f) STR2CSTR(value) rb_str_new2(s)
VALUE rb_define_class(const char *name, VALUE super)
-- Defines a new Ruby class as a subclass of super.
VALUE rb_define_class_under(VALUE module, const char *name, VALUE super)
-- Creates a new Ruby class as a subclass of super, under the
module'snamespace.
VALUE rb_define_module(const char *name)
-- Defines a new Ruby module.
VALUE rb_define_module_under(VALUE module, const char *name)
-- Defines a new Ruby module under the module's namespace.
void rb_include_module(VALUE klass, VALUE module)
-- Includes module into class.
If class already includes it, just ignored.
void rb_extend_object(VALUE object, VALUE module)
-- Extend the object with the module's attributes.
void rb_define_variable(const char *name, VALUE *var)
-- Defines a global variable which is shared between C and Ruby.
If name contains a character which is not allowed to be part
of the symbol, it can't be seen from Ruby programs.
void rb_define_readonly_variable(const char *name, VALUE *var)
-- Defines a read-only global variable. Works just like
rb_define_variable(), except defined variable is
read-only.
void rb_define_virtual_variable(const char *name,
VALUE (*getter)(), VALUE (*setter)())
-- Defines a virtual variable, whose behavior is defined by a pair of C
functions. The getter function is called when the variable is referred.
The setter function is called when the value is set to the variable.
The prototype for getter/setter functions are:
VALUE getter(ID id)
void setter(VALUE val, ID id)
-- The getter function must return the value for the access.
void rb_define_hooked_variable(const char *name, VALUE *var,
VALUE (*getter)(), VALUE (*setter)())
-- Defines hooked variable. It's a virtual variable with
a C variable.
VALUE getter(ID id, VALUE *var)
-- returning a new value. The setter is called as
void setter(VALUE val, ID id, VALUE *var)
-- GC requires C global variables which hold Ruby values to be marked.
void rb_global_variable(VALUE *var)
-- Tells GC to protect these variables.
void rb_define_const(VALUE klass, const char *name, VALUE val)
-- Defines a new constant under the class/module.
void rb_define_global_const(const char *name, VALUE val)
-- Defines a global constant. This is just the same as
rb_define_const(cKernal, name, val).
rb_define_method(VALUE klass, const char *name, VALUE (*func)(), int argc)
-- Defines a method for the class. Func is the function pointer.
argc is the number of arguments. if argc is -1, the function will
receive 3 arguments: argc, argv, and self. if argc is -2, the function
will receive 2 arguments, self and args, where args is a Ruby array of
the method arguments.
rb_define_private_method(VALUE klass, const char *name, VALUE (*func)(), int argc)
-- Defines a private method for the class. Arguments are same as
rb_define_method().
rb_define_singleton_method(VALUE klass, const char *name, VALUE (*func)(), int argc)
-- Defines a singleton method. Arguments are same as rb_define_method().
rb_scan_args(int argc, VALUE *argv, const char *fmt, ...)
-- Retrieve argument from argc, argv. The fmt is the format string for
the arguments, such as "12" for 1 non-optional argument, 2 optional
arguments. If `*' appears at the end of fmt, it means the rest of
the arguments are assigned to the corresponding variable, packed in
an array.
VALUE rb_funcall(VALUE recv, ID mid, int narg, ...)
-- Invokes a method. To retrieve mid from a method name, use rb_intern().
VALUE rb_funcall2(VALUE recv, ID mid, int argc, VALUE *argv)
-- Invokes a method, passing arguments by an array of values.
VALUE rb_eval_string(const char *str)
-- Compiles and executes the string as a Ruby program.
ID rb_intern(const char *name)
-- Returns ID corresponding to the name.
char *rb_id2name(ID id)
-- Returns the name corresponding ID.
char *rb_class2name(VALUE klass)
-- Returns the name of the class.
int rb_respond_to(VALUE object, ID id)
-- Returns true if the object responds to the message specified by id.
VALUE rb_iv_get(VALUE obj, const char *name)
-- Retrieve the value of the instance variable. If the name is not
prefixed by `@', that variable shall be inaccessible from Ruby.
VALUE rb_iv_set(VALUE obj, const char *name, VALUE val)
-- Sets the value of the instance variable.
VALUE rb_iterate(VALUE (*func1)(), void *arg1, VALUE (*func2)(), void *arg2)
-- Calls the function func1, supplying func2 as the block. func1 will be
called with the argument arg1. func2 receives the value from yield as
the first argument, arg2 as the second argument.
VALUE rb_yield(VALUE val)
-- Evaluates the block with value val.
VALUE rb_rescue(VALUE (*func1)(), void *arg1, VALUE (*func2)(), void *arg2)
-- Calls the function func1, with arg1 as the argument. If an exception
occurs during func1, it calls func2 with arg2 as the argument. The
return value of rb_rescue() is the return value from func1 if no
exception occurs, from func2 otherwise.
VALUE rb_ensure(VALUE (*func1)(), void *arg1, void (*func2)(), void *arg2)
-- Calls the function func1 with arg1 as the argument, then calls func2
with arg2 if execution terminated. The return value from
rb_ensure() is that of func1.
void rb_warn(const char *fmt, ...)
-- Prints a warning message according to a printf-like format.
void rb_warning(const char *fmt, ...)
-- Prints a warning message according to a printf-like format, if
$VERBOSE is true.
void rb_raise(rb_eRuntimeError, const char *fmt, ...)
-- Raises RuntimeError. The fmt is a format string just like printf().
void rb_raise(VALUE exception, const char *fmt, ...)
-- Raises a class exception. The fmt is a format string just like printf().
void rb_fatal(const char *fmt, ...)
-- Raises a fatal error, terminates the interpreter. No exception handling
will be done for fatal errors, but ensure blocks will be executed.
void rb_bug(const char *fmt, ...)
-- Terminates the interpreter immediately. This function should be
called under the situation caused by the bug in the interpreter. No
exception handling nor ensure execution will be done.
void ruby_init()
-- Initializes the interpreter.
void ruby_options(int argc, char **argv)
-- Process command line arguments for the interpreter.
void ruby_run()
-- Starts execution of the interpreter.
void ruby_script(char *name)
-- Specifies the name of the script ($0).
have_library(lib, func)
-- Checks whether the library exists, containing the specified function.
Returns true if the library exists.
find_library(lib, func, path...)
-- Checks whether a library which contains the specified function exists in
path. Returns true if the library exists.
have_func(func, header)
-- Checks whether func exists with header. Returns true if the function
exists. To check functions in an additional library, you need to
check that library first using have_library().
have_header(header)
-- Checks whether header exists. Returns true if the header file exists.
create_makefile(target)
-- Generates the Makefile for the extension library. If you don't invoke
this method, the compilation will not be done.
with_config(withval[, default=nil])
-- Parses the command line options and returns the value specified by
'--with-'.
dir_config(target[, default_dir])
dir_config(target[, default_include, default_lib])
-- Parses the command line options and adds the directories specified by
'--with--dir', '--with--include', and/or '--with--lib'
to $CFLAGS and/or $LDFLAGS. '--with--dir=/path' is equivalent to
'--with--include=/path/include' '--with--lib=/path/lib'.
Returns an array of the added directories ([include_dir, lib_dir]).