The constructor new_prt() creates a prt object from one or several fst files, making sure that each table consists of identically named, ordered and typed columns. In order to create a prt object from an in-memory table, as_prt() coerces objects inheriting from data.frame to prt by first splitting rows into n_chunks, writing fst files to the directory dir and calling new_prt() on the resulting fst files. If this default splitting of rows (which might impact efficiency of subsequent queries on the data) is not optimal, a list of objects inheriting from data.frame is a valid x argument as well.
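The two construction routes described above can be sketched as follows. This is a minimal illustration that assumes the prt and fst packages are installed; the temporary directory, the file names and the 20/12 row split are arbitrary choices made for the example, not package defaults.

```r
library(prt)

# Route 1: write two fst files by hand, then hand them to new_prt().
# The directory and file names are illustrative only.
tmp <- tempfile()
dir.create(tmp)
files <- file.path(tmp, c("part1.fst", "part2.fst"))
fst::write_fst(mtcars[1:20, ], files[1])
fst::write_fst(mtcars[21:32, ], files[2])
from_files <- new_prt(files)

# Route 2: pass a list of data.frames to as_prt() to control the split
# instead of relying on the default row chunking.
from_list <- as_prt(list(mtcars[1:20, ], mtcars[21:32, ]))

n_part(from_files)
part_nrow(from_list)
```

Both objects should describe the same 32 rows; the list-based route is mainly useful when rows ought to be grouped by something other than position.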
new_prt(files)
as_prt(x, n_chunks = NULL, dir = tempfile())
is_prt(x)
n_part(x)
part_nrow(x)
# S3 method for prt
head(x, n = 6L, ...)
# S3 method for prt
tail(x, n = 6L, ...)
# S3 method for prt
as.data.table(x, ...)
# S3 method for prt
as.list(x, ...)
# S3 method for prt
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
# S3 method for prt
as.matrix(x, ...)
files
Character vector of file name(s).
x
A prt object.
n_chunks
Count variable specifying the number of chunks x is split into.
dir
Directory where the chunked fst::fst() objects reside.
n
Count variable indicating the number of rows to return.
...
Generic consistency: additional arguments are ignored and a warning is issued.
row.names, optional
Generic consistency: passing anything other than the default value issues a warning.
To check whether an object inherits from prt, the function is_prt() is exported. The number of partitions can be queried by calling n_part() and the number of rows per partition is available as part_nrow().
The base R S3 generic functions dim(), length(), dimnames() and names() have prt-specific implementations, where dim() returns the overall table dimensions, length() is synonymous with ncol(), dimnames() returns a length 2 list containing NULL (for row names) and the column names as a character vector, and names() is synonymous with colnames(). Neither setting nor getting row names on prt objects is supported and, more generally, calling replacement functions such as names<-() or dimnames<-() leads to an error, as prt objects are immutable. The base R S3 generic functions head() and tail() are available as well and are used internally to provide an extensible mechanism for printing (see format_dt()).
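The dimension queries and the immutability rule described above can be tried out with a short sketch. It assumes the prt package is installed and reuses the mtcars data; the tryCatch() wrapper and the res variable are only there so the example runs to completion rather than stopping at the error.

```r
library(prt)

cars <- as_prt(mtcars, n_chunks = 2L)

dim(cars)       # overall dimensions across all partitions
length(cars)    # number of columns, same as ncol(cars)
dimnames(cars)  # row names entry (NULL) followed by column names

# Replacement functions are documented to fail because prt objects are
# immutable; catch the error so the script continues.
res <- tryCatch(
  {
    names(cars) <- toupper(names(cars))
    "modified"
  },
  error = function(e) "error"
)
res
```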
Coercion to other base R objects is possible via as.list(), as.data.frame() and as.matrix(). For coercion to data.table, its generic function data.table::as.data.table() is available to prt objects. All coercion involves reading the full data into memory at once, which might be problematic for large data sets.
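As a brief sketch of the two coercion routes not otherwise demonstrated on this page, assuming the prt and data.table packages are installed (the variable names mat and dt are illustrative):

```r
library(prt)

cars <- as_prt(mtcars, n_chunks = 2L)

# Both calls read every partition into memory before returning.
mat <- as.matrix(cars)
dt <- data.table::as.data.table(cars)

dim(mat)
class(dt)
```

For a table as small as mtcars this is harmless; with large partitioned data, the same calls would materialise the whole table at once.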
cars <- as_prt(mtcars, n_chunks = 2L)
is_prt(cars)
#> [1] TRUE
n_part(cars)
#> [1] 2
part_nrow(cars)
#> [1] 16 16
nrow(cars)
#> [1] 32
ncol(cars)
#> [1] 11
colnames(cars)
#> [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
#> [11] "carb"
names(cars)
#> [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
#> [11] "carb"
head(cars)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> 1: 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> 2: 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> 3: 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> 4: 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5: 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> 6: 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
tail(cars, n = 2)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> 1: 15.0 8 301 335 3.54 3.57 14.6 0 1 5 8
#> 2: 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
str(as.list(cars))
#> List of 11
#> $ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#> $ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
#> $ disp: num [1:32] 160 160 108 258 360 ...
#> $ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
#> $ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#> $ wt : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
#> $ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
#> $ vs : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
#> $ am : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
#> $ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
#> $ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
str(as.data.frame(cars))
#> 'data.frame': 32 obs. of 11 variables:
#> $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#> $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
#> $ disp: num 160 160 108 258 360 ...
#> $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
#> $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#> $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
#> $ qsec: num 16.5 17 18.6 19.4 17 ...
#> $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
#> $ am : num 1 1 1 0 0 0 0 0 0 0 ...
#> $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
#> $ carb: num 4 4 1 1 2 1 4 2 2 4 ...