Suppose I have a BytesMut object and keep writing data into it, then split frames off of it according to the frame segmentation format.
My understanding is that the memory is allocated once and then handed out piece by piece through the splits. So the question is: I can keep making its capacity smaller by splitting, but at what point does the contiguous allocation actually get freed?
If I keep splitting data off the front, won't the memory usage grow larger and larger? Or maybe my understanding is wrong and the different BytesMut objects produced by a split end up with different native buffers, but how is that done?
#[test]
fn test_bytesmut_growth() {
    use bytes::{BufMut, BytesMut};
    let mut bm = BytesMut::with_capacity(16);
    for _ in 0..10000 {
        bm.put(&b"1234567890"[..]);
        let front = bm.split();
        drop(front);
    }
    //println!("current cap={}, len={}", bm.capacity(), bm.len());
}
If you split a BytesMut object into two, the two objects will still share the underlying reference-counted buffer. Here's an attempt at a visualization, containing a few implementation details. Before splitting:
Underlying buffer, ref count 1
┌────────────────────────────────┐
│0123456789ABCDEFGHIJ            │
└▲───────────────────────────────┘
 │
 │  first
 │  ┌────────────┐
 ├──┤ptr         │
 │  │len: 20     │
 │  │cap: 32     │
 └──┤data        │
    └────────────┘
After calling let second = first.split_off(10), we will get
Underlying buffer, ref count 2
┌────────────────────────────────┐
│0123456789ABCDEFGHIJ            │
└▲─────────▲─────────────────────┘
 │         │
 │         └──────────┐
 │  first             │  second
 │  ┌────────────┐    │     ┌────────────┐
 ├──┤ptr         │    └─────┤ptr         │
 │  │len: 10     │          │len: 10     │
 │  │cap: 10     │          │cap: 22     │
 ├──┤data        │    ┌─────┤data        │
 │  └────────────┘    │     └────────────┘
 │                    │
 └────────────────────┘
Once we drop first, we will have
Underlying buffer, ref count 1
┌────────────────────────────────┐
│0123456789ABCDEFGHIJ            │
└▲─────────▲─────────────────────┘
 │         │
 │         │
 │         │                second
 │         │                ┌────────────┐
 │         └────────────────┤ptr         │
 │                          │len: 10     │
 │                          │cap: 22     │
 └──────────────────────────┤data        │
                            └────────────┘
If you now call second.reserve(20), or call an operation that implicitly reserves, like writing more than fits in the current capacity, the BytesMut implementation can detect that second actually owns its underlying buffer, since the reference count is one. The implementation may then be able to reuse spare capacity in the buffer by moving the existing contents to the beginning of the buffer, so after second.reserve(20) the result could look like this:
Underlying buffer, ref count 1
┌────────────────────────────────┐
│ABCDEFGHIJ                      │
└▲───────────────────────────────┘
 │
 │  second
 │  ┌────────────┐
 ├──┤ptr         │
 │  │len: 10     │
 │  │cap: 32     │
 └──┤data        │
    └────────────┘
However, the conditions for this optimization to be applied are not guaranteed. The documentation for reserve() states (emphasis mine)
Before allocating new buffer space, the function will attempt to reclaim space in the existing buffer. If the current handle references a view into a larger original buffer, and all other handles referencing part of the same original buffer have been dropped, then the current view can be copied/shifted to the front of the buffer and the handle can take ownership of the full buffer, provided that the full buffer is large enough to fit the requested additional capacity.
This optimization will only happen if shifting the data from the current view to the front of the buffer is not too expensive in terms of the (amortized) time required. The precise condition is subject to change; as of now, the length of the data being shifted needs to be at least as large as the distance that it’s shifted by. If the current view is empty and the original buffer is large enough to fit the requested additional capacity, then reallocations will never happen.
In summary, this optimization is only guaranteed if the reference count is one and the view is empty. This is the case in your example, so your code is guaranteed to reuse the buffer.
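Here is a small self-contained sketch (separate from your test, written only to make the sequence above concrete). Whether the final pointer comparison prints true depends on the reclaim heuristics quoted above, since the view here is not empty:
use bytes::{BufMut, BytesMut};

fn main() {
    let mut first = BytesMut::with_capacity(32);
    first.put(&b"0123456789ABCDEFGHIJ"[..]);
    let buffer_start = first.as_ptr();

    // Both handles now share the same reference-counted buffer.
    let mut second = first.split_off(10);
    drop(first); // the reference count drops back to one

    // `second` starts 10 bytes into the buffer with len 10 and cap 22.
    // Asking for more than the remaining spare capacity gives reserve() a
    // chance to shift the contents to the front instead of reallocating.
    second.reserve(20);

    println!("len = {}, cap = {}", second.len(), second.capacity());
    // true only if the reclaim optimization described above was applied
    println!("reused original buffer: {}", second.as_ptr() == buffer_start);
}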
According to the documentation, BytesMut::split 'Removes the bytes from the current view, returning them in a new BytesMut handle.
Afterwards, self will be empty, but will retain any additional capacity that it had before the operation.'
This is done by creating a new BytesMut (which is then owned by front) that contains exactly the items of bm, after which bm is modified such that it contains only the remaining empty capacity. This way, BytesMut::split doesn't allocate any new memory.
You then drop the BytesMut (owned by front), making it so that there is no view into the memory owned by the backing Vec, from the start of the Vec until the start of bm. When you then put, the implementation first checks if there is enough space before the view of bm, but still inside the backing Vec, and tries to store the data there.
Because the amount of memory you put is the same as the memory 'freed' by dropping front, the implementation is able to store that data before the view of bm, and no new memory is allocated.
I have a Terraform module calling a submodule, which also calls another submodule. The final module uses a ternary condition as part of some logic to determine whether a dynamic block should be omitted in a resource definition.
I'm going to only include the pertinent code here, else it would get unnecessarily complicated.
The first module call:
module "foobar" {
source = "./modules/foobar"
...
vpc_cidr = "10.0.0.0/16"
# or vpc_cidr = null, or omitted altogether as the default value is null
...
}
The second module (in "./modules/foobar"):
module "second_level" {
source = "./modules/second_level"
...
vpc_config = var.vpc_cidr == null ? {} : { "some" = "things }
...
}
The third module (in "./modules/second_level"):
locals {
  vpc_config = var.vpc_config == {} ? {} : { this = var.vpc_config }
}
resource "aws_lambda_function" "this" {
  ...
  dynamic "vpc_config" {
    for_each = local.vpc_config
    content {
      "some" = vpc_config.value["some"]
    }
  }
  ...
}
This is all horribly simplified, as I'm sure you're already aware, and you might have some questions about why I'm doing things like in the second level ternary operator. I can only say that there are "reasons", but they'd detract from my question.
When I run this, I expect the dynamic block to be filled when the value of vpc_cidr is not null. When I run it with a value in vpc_cidr, it works, and the dynamic block is added.
If vpc_cidr is null however, I get an error like this:
│ 32: security_group_ids = vpc_config.value["some"]
│ ├────────────────
│ │ vpc_config.value is empty map of dynamic
The really odd thing is that if I swap the ternary around so it's actually the reverse of what I want, like this: vpc_config = var.vpc_config == {} ? { this = var.vpc_config } : {}, then everything works as I want.
EDIT
Some more context after the correct answer, because what I'm asking for indeed looks strange.
"Wrapping this map into another single-element map with a hard-coded key if it's not empty"
I was originally doing this because I needed to iterate just once over the map in the for_each block (and it contains more than a single key), so I was faking a single key by putting a dummy key in there to iterate over.
As @martin-atkins points out in the answer, though, for_each can iterate over any collection type. Therefore, I've simplified the locals assignment like this:
locals {
  vpc_config = length(var.vpc_config) == 0 ? [] : [var.vpc_config]
}
This means that I can run a more direct dynamic block, and do what I really want, which is iterate over a list:
dynamic "vpc_config" {
for_each = local.vpc_config
content {
subnet_ids = var.vpc_config["subnet_ids"]
security_group_ids = var.vpc_config["security_group_ids"]
}
}
It's still a little hacky because I'm converting a map to a list of maps, but it makes more sense further up the chain of modules.
Using the == operator to compare complex types is very rarely what you want, because == means "exactly the same type and value", and so, unlike in many other contexts, it suddenly becomes very important to pay attention to the difference between object types and map types, between map types with different element types, and so on.
The expression {} has type object({}), and so a value of that type can never compare equal to a map(string) value, even if that map is empty. Normally the distinction between object types and map types is ignorable because Terraform will automatically convert between them, but the == operator doesn't give Terraform any information about what types you mean and so no automatic conversions are possible and you must get the types of the operands right yourself.
The easiest answer to avoid dealing with that is to skip using == at all and instead just use the length of the collection as the condition:
vpc_config = length(var.vpc_config) == 0 ? {} : { this = var.vpc_config }
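For illustration, here is how that condition fits together with the input variable. The variable declaration below is an assumption about ./modules/second_level, not code taken from the question:
# Illustrative sketch only: the variable declaration is assumed, not from the question.
variable "vpc_config" {
  type    = map(string)
  default = {}
}

locals {
  # length() works the same way on maps, objects, and lists, so the condition
  # no longer depends on getting the operand types of == exactly right.
  vpc_config = length(var.vpc_config) == 0 ? {} : { this = var.vpc_config }
}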
Wrapping this map into another single-element map with a hard-coded key if it's not empty seems like an unusual thing to be doing, and so I wonder if this might be an XY Problem and there might be a more straightforward way to achieve your goal here, but I've focused on directly answering your question as stated.
You might find it interesting to know that the for_each argument in a dynamic block can accept any collection type, so (unlike for resource for_each, where the instance keys are significant for tracking) you shouldn't typically need to create synthetic extra maps to fake conditional blocks. A zero-or-one-element list would work just as well for generating zero or one blocks, for example.
All of your code is behaving as expected. The issue here is that the dynamic block iterator is likely not being lazily evaluated at compilation, but rather only at runtime. We can work around this by providing a "failover" value to resolve against for the situation when vpc_config.value is empty, and therefore has no "some" key.
content {
  "some" = try(vpc_config.value["some"], null)
}
Since we do not know the specifics, we have to assume it is safe to supply a null argument to the some parameter.
I have a computationally expensive vector I want to index into inside a function. Since the table is never used anywhere else, I don't want to pass the vector around; I want to access the precomputed values like a memoized function.
The idea is:
cachedFunction :: Int -> Int
cachedFunction ix = table ! ix
  where table = <vector creation>
One aspect I've noticed is that all the memoization examples I've seen deal with recursion, where even if a table is used to memoize, the values in the table depend on other values in the table. That is not so in my case: the computed values are found using a trial-and-error approach, but each element is independent of the others.
How do I achieve the cached table in the function?
You had it almost right. The problem is that your example is basically scoped like this:
                    ┌────────────────────────────────┐
cachedFunction ix = │ table ! ix                     │
                    │where table = <vector creation> │
                    └────────────────────────────────┘
i.e. table is not shared between different ix. This is regardless of the fact that it happens to not depend on ix (which is obvious in this example, but not in general). Therefore it would not be useful to keep it in memory, and Haskell doesn't do it.
But you can change that by pulling the ix argument out of the left-hand side and into the result, so that the where-block is attached to the argument-free binding:
cachedFunction = \ix -> table ! ix
  where table = <vector creation>
i.e.
                 ┌────────────────────────────────┐
cachedFunction = │ \ix -> table ! ix              │
                 │where table = <vector creation> │
                 └────────────────────────────────┘
or shorter,
cachedFunction = (<vector creation> !)
In this form, cachedFunction is a constant applicative form, i.e. despite having a function type it is treated by the compiler as a constant value. It's not a value you could ever evaluate to normal form, but it will keep the same table around (which can't depend on ix; it doesn't have it in scope) when using it for evaluating the lambda function inside.
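As a concrete sketch of this shape (it assumes the expensive table can be expressed with Data.Vector's generate; the squaring below is just a stand-in for the real computation):
import qualified Data.Vector as V

-- cachedFunction is a constant applicative form: the where-bound table
-- belongs to the zero-argument binding, so it is built once and shared.
cachedFunction :: Int -> Int
cachedFunction = (table V.!)
  where
    table :: V.Vector Int
    table = V.generate 1000 (\i -> i * i)  -- stand-in for the expensive computation

main :: IO ()
main = print (map cachedFunction [3, 7, 11])  -- the table is built once, on first use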
According to this answer, GHC will never recompute values declared at the top level of a module. So by moving your table up to the top level of your module, it will be evaluated lazily (once) the first time it's ever needed, and it will never be recomputed after that. We can see the behavior directly with Debug.Trace (the example uses a simple integer rather than a vector, for simplicity):
import Debug.Trace

cachedFunction :: Int -> Int
cachedFunction ix = table + ix

table = traceShow "Computed" 0

main :: IO ()
main = do
  print 0
  print $ cachedFunction 1
  print $ cachedFunction 2
Outputs:
0
"Computed"
1
2
We see that table is not computed until cachedFunction is called, and it's only computed once, even though we call cachedFunction twice.
This is a dumb question, so I apologise if so. This is for Julia, but I guess the question is not language specific.
There is advice in Julia that global variables should not be used in functions, but there is a case where I am not sure whether a variable is global or local: a variable that is defined in a function but acts like a global for a nested function. For example, in the following,
a=2;
f(x)=a*x;
variable a is considered global. However, if we were to wrap this all in another function, would a still be considered global for f? For example,
function g(a)
    f(x)=a*x;
end
We don't use a as an input for f, so it's global in that sense, but it's still only defined in the scope of g, so it's local in that sense. I am not sure. Thank you.
You can check directly that what @DNF commented is indeed the case (i.e. that the variable a is captured in a closure).
Here is the code:
julia> function g(a)
           f(x)=a*x
       end
g (generic function with 1 method)
julia> v = g(2)
(::var"#f#1"{Int64}) (generic function with 1 method)
julia> dump(v)
f (function of type var"#f#1"{Int64})
  a: Int64 2
In this example your function g returns a function. I bind the variable v to the returned function so that I can inspect it.
If you dump the value bound to the v variable you can see that the a variable is stored in the closure.
A variable stored in a closure should not be a problem for the performance of your code. This is a typical pattern used e.g. when doing optimization of some function conditional on some parameter (captured in a closure).
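For example, a minimal sketch of that pattern (the names here are made up for illustration):
julia> make_objective(a) = x -> (x - a)^2
make_objective (generic function with 1 method)

julia> obj = make_objective(3.0);

julia> obj(1.0)   # `a` lives in the closure, not in a global
4.0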
As you can see in this code:
julia> @code_warntype v(10)
MethodInstance for (::var"#f#1"{Int64})(::Int64)
from (::var"#f#1")(x) in Main at REPL[1]:2
Arguments
#self#::var"#f#1"{Int64}
x::Int64
Body::Int64
1 ─ %1 = Core.getfield(#self#, :a)::Int64
│ %2 = (%1 * x)::Int64
└── return %2
everything is type stable so such code is fast.
There are some situations, though, in which boxing happens (they should be rare; they occur when your function is so complex that the compiler is not able to prove that boxing is not needed, and most of the time it happens when you assign a value to the variable captured in the closure):
julia> function foo()
           x::Int = 1
           return bar() = (x = 1; x)
       end
foo (generic function with 1 method)
julia> dump(foo())
bar (function of type var"#bar#6")
  x: Core.Box
    contents: Int64 1
julia> @code_warntype foo()()
MethodInstance for (::var"#bar#1")()
from (::var"#bar#1")() in Main at REPL[1]:3
Arguments
#self#::var"#bar#1"
Locals
x::Union{}
Body::Int64
1 ─ %1 = Core.getfield(#self#, :x)::Core.Box
│ %2 = Base.convert(Main.Int, 1)::Core.Const(1)
│ %3 = Core.typeassert(%2, Main.Int)::Core.Const(1)
│ Core.setfield!(%1, :contents, %3)
│ %5 = Core.getfield(#self#, :x)::Core.Box
│ %6 = Core.isdefined(%5, :contents)::Bool
└── goto #3 if not %6
2 ─ goto #4
3 ─ Core.NewvarNode(:(x))
└── x
4 ┄ %11 = Core.getfield(%5, :contents)::Any
│ %12 = Core.typeassert(%11, Main.Int)::Int64
└── return %12
I came up with an incorrect J verb in my head, which would find the proportion of redundant letters in a string. I started with just a bunch of verbs with no precedence defined, and tried grouping inwards:
c=. 'cool' NB. The test data string, 1/4 is redundant.
box =. 5!:2 NB. The verb to show the structure of another verb in a box.
p=.%#~.%# NB. First attempt. Meant to read "inverse of (tally of unique divided by tally)".
box < 'p'
┌─┬─┬────────┐
│%│#│┌──┬─┬─┐│
│ │ ││~.│%│#││
│ │ │└──┴─┴─┘│
└─┴─┴────────┘
p2=.%(#~.%#) NB. The first tally is meant to be in there with the nub sieve, so paren everything after the inverse monad.
box < 'p2'
┌─┬────────────┐
│%│┌─┬────────┐│
│ ││#│┌──┬─┬─┐││
│ ││ ││~.│%│#│││
│ ││ │└──┴─┴─┘││
│ │└─┴────────┘│
└─┴────────────┘
p3=. %((#~.)%#) NB. The first tally is still not grouped with the nub sieve, so paren the two together directly.
box < 'p3'
┌─┬────────────┐
│%│┌──────┬─┬─┐│
│ ││┌─┬──┐│%│#││
│ │││#│~.││ │ ││
│ ││└─┴──┘│ │ ││
│ │└──────┴─┴─┘│
└─┴────────────┘
p3 c NB. Looks about right, so test it!
|length error: p3
| p3 c
(#~.)c NB. Unexpected error, but I guessed as to what part had the problem.
|length error
| (#~.)c
My question is, why did my approach to grouping fail with this length error, and how should I have grouped it to get the desired effect?
(I assume it is something to do with turning it into a hook instead of grouping, or it just not realising it needs to use the monad forms, but I don't know how to verify or get around it if so.)
Fork and compose.
(# ~.) is a hook. This is probably what you're not expecting. (# ~.) 'cool' is applying ~. to 'cool' to give you 'col'. But as it's a monadic hook, it is then attempting 'cool' # 'col', which isn't what you're intending and which gives a length error.
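For contrast, a hook whose left verb can sensibly take those two arguments works fine; the classic textbook example (not related to your verb) is:
(+ %) 4   NB. monadic hook: y f (g y), here 4 + % 4
4.25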
To get 0.25 as the ratio of redundant characters in a string, don't use the reciprocal (%). You just subtract from 1 the ratio of unique characters. This is pretty straightforward with a fork:
(1 - #&~. % #) 'cool'
0.25
p9 =. 1 - #&~. % #
box < 'p9'
┌─┬─┬──────────────┐
│1│-│┌────────┬─┬─┐│
│ │ ││┌─┬─┬──┐│%│#││
│ │ │││#│&│~.││ │ ││
│ │ ││└─┴─┴──┘│ │ ││
│ │ │└────────┴─┴─┘│
└─┴─┴──────────────┘
Compose (&) ensures that you tally (#) the nub (~.) together, so that the fork grabs it as a single verb. The fork is a series of three verbs that applies the first and third verb, and then applies the middle verb to the results. So #&~. % # is the fork, where #&~. is applied to the string, resulting in 3. # is applied, resulting in 4. And then % is applied to those results, as 3 % 4, giving you 0.75. That's our ratio of unique characters.
1 - is just to get us 0.25 instead of 0.75. % 0.75 is the same as 1 % 0.75, which gives you 1.33333.
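If you would rather compute the redundant count directly instead of subtracting the unique ratio from 1, a fork of forks works too (just a variant of the same idea, not something you need):
p10 =. (# - #&~.) % #   NB. (tally minus tally of the nub) divided by tally
p10 'cool'
0.25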
I'm going through the data structures chapter in The Algorithm Design Manual and came across Suffix Trees.
The example states:
Input:
XYZXYZ$
YZXYZ$
ZXYZ$
XYZ$
YZ$
Z$
$
Output:
I'm not able to understand how that tree gets generated from the given input string. Suffix trees are used to find a given substring in a given string, but how does the given tree help with that? I do understand the other example of a trie shown below, but if that trie gets compacted into a suffix tree, what would it look like?
The standard efficient algorithms for constructing a suffix tree are definitely nontrivial. The main algorithm for doing so is called Ukkonen's algorithm and is a modification of the naive algorithm with two extra optimizations. You are probably best off reading this earlier question for details on how to build it.
You can construct suffix trees by using the standard insertion algorithms on radix tries to insert each suffix into the tree, but doing so will take O(n²) time, which can be expensive for large strings.
As for doing fast substring searching, remember that a suffix tree is a compressed trie of all the suffixes of the original string (plus some special end-of-string marker). If a string S is a substring of the initial string T and you have a trie of all the suffixes of T, then you can just do a search to see if S is a prefix of any of the strings in that trie. If so, then S must be a substring of T, since all of its characters exist in sequence somewhere in T. The suffix tree substring search algorithm is precisely this search applied to the compressed trie, where you follow the appropriate edges at each step.
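Here is a minimal sketch of that naive approach (in Python, purely for illustration): insert every suffix into an uncompressed trie, then answer a substring query by walking the pattern down from the root.
def build_suffix_trie(text):
    """Insert every suffix of text (plus a terminator) into a nested-dict trie."""
    text += "$"
    root = {}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            node = node.setdefault(ch, {})
    return root

def contains_substring(root, pattern):
    """pattern is a substring iff it is a prefix of some suffix, i.e. iff we
    can walk every character of pattern down from the root."""
    node = root
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True

trie = build_suffix_trie("XYZXYZ")
print(contains_substring(trie, "ZXY"))  # True
print(contains_substring(trie, "XYX"))  # False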
Hope this helps!
I'm not able to understand how that tree gets generated from the given input string.
You essentially create a patricia trie with all the suffixes you've listed. When inserting into a patricia trie, you search the root for a child starting with the first character of the string being inserted; if it exists you continue down the tree, and if it doesn't you create a new node off the root. The root will have as many children as there are unique characters in the input string ($, a, e, h, i, n, r, s, t, w). You continue that process for each character of the suffix being inserted, and repeat it for every suffix.
Suffix trees are used to find a given Substring in a given String, but how does the given tree help towards that?
If you are looking for the substring "hen", start searching from the root for a child which starts with "h". Then walk along the string stored in child "h" until you either reach the end of that string or hit a mismatch between it and the input. If you match all of child "h" (the input "hen" matches the "he" stored there), move on to the children of "h" and look for one beginning with "n"; if you fail to find such a child, the substring doesn't exist.
Compact Suffix Trie code:
└── (black)
    ├── (white) as
    ├── (white) e
    │   ├── (white) eir
    │   ├── (white) en
    │   └── (white) ere
    ├── (white) he
    │   ├── (white) heir
    │   ├── (white) hen
    │   └── (white) here
    ├── (white) ir
    ├── (white) n
    ├── (white) r
    │   └── (white) re
    ├── (white) s
    ├── (white) the
    │   ├── (white) their
    │   └── (white) there
    └── (black) w
        ├── (white) was
        └── (white) when
Suffix Tree code:
String = the$their$there$was$when$
End of word character = $
└── (0)
    ├── (22) $
    ├── (25) as$
    ├── (9) e
    │   ├── (10) ir$
    │   ├── (32) n$
    │   └── (17) re$
    ├── (7) he
    │   ├── (2) $
    │   ├── (8) ir$
    │   ├── (31) n$
    │   └── (16) re$
    ├── (11) ir$
    ├── (33) n$
    ├── (18) r
    │   ├── (12) $
    │   └── (19) e$
    ├── (26) s$
    ├── (5) the
    │   ├── (1) $
    │   ├── (6) ir$
    │   └── (15) re$
    └── (29) w
        ├── (24) as$
        └── (30) hen$
A suffix tree basically just compacts runs of letters together when there are no choices to be made. For example, if you look at the right side of the trie in your question, after you've seen a w, there are really only two choices: was and when. In the trie, the as in was and the hen in when each still have one node per letter. In a suffix tree, you'd put those together into two nodes holding as and hen, so the right side of your trie would turn into:
└── w
    ├── as
    └── hen