Identifying an existing folder [duplicate] - linux

This question already has answers here:
Expand tilde to home directory
(6 answers)
reader.ReadString does not strip out the first occurrence of delim
(4 answers)
Closed 3 years ago.
I have an issue where it seems Go is telling me that a folder doesn't exist, when it clearly does.
path, _ := reader.ReadString('\n')
path, err := expand(path)
fmt.Println("Path Expanded: ", path, err)
if err == nil {
    if _, err2 := os.Lstat(path); err2 == nil {
        fmt.Println("Valid Path")
    } else if os.IsNotExist(err2) {
        fmt.Println("Invalid Path")
        fmt.Println(err2)
    } else {
        fmt.Println(err2)
    }
}
The expand function simply translates the ~ to the user's home directory.
func expand(path string) (string, error) {
    if len(path) == 0 || path[0] != '~' {
        return path, nil
    }
    usr, err := user.Current()
    if err != nil {
        return "", err
    }
    return filepath.Join(usr.HomeDir, path[1:]), nil
}
If I input the value of ~, it correctly translates it to /home/<user>/, but it ultimately states that the folder does not exist, even though it clearly does, and I know I have access to it, so it doesn't seem to be a permissions thing.
If I try /root/ as the input, I correctly get a permissions error, and I am OK with that. But I expect my ~ directory to return "Valid Path".
My error is almost always: no such file or directory
I am on Lubuntu 19.xx and it is a fairly fresh install. I am running this app from ~/Projects/src/Playground/AppName and I am using the bash terminal from VS Code.
I have also tried both Lstat and Stat unsuccessfully, not to mention a ton of examples and different approaches. I am sure this is some underlying Linux thing that I don't understand...

The answer to this is that I was not trimming the string returned by ReadString, which keeps the \n delimiter in the result. Adding strings.Trim(path, "\n") corrected the issue.
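For reference, a minimal sketch of the fix (assuming the same reader and expand as above; strings.TrimSpace would work just as well):

path, _ := reader.ReadString('\n')
path = strings.Trim(path, "\n") // drop the trailing delimiter so Lstat sees the real path
path, err := expand(path)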

Related

Get last n files in directory sorted by timestamp without listing all files

I am trying to get the last N files from a directory sorted by creation/modification time.
I am currently using this code:
files, err := ioutil.ReadDir(path)
if err != nil {
    return 0, err
}
sort.Slice(files, func(i, j int) bool {
    return files[i].ModTime().Before(files[j].ModTime())
})
The problem here is that the expected number of files in this directory is ~2 million, and when I read all of them into a slice it consumes a lot of memory, ~800 MB. It is also not certain when the GC will reclaim that memory.
Is there another way to get the last N files in the directory sorted by timestamp without reading all of the entries into memory?
My first answer using filepath.Walk was still allocating a huge chunk of memory, as @Marc pointed out. So here is an improved algorithm.
Note: This is not an optimized algorithm. It is just about providing an idea of how to tackle the problem.
maxFiles := 5
batch := 100 // tune to find a good balance between syscalls and memory
dir, err := os.Open(path)
if err != nil {
    log.Fatal(err)
}
defer dir.Close()
var files []os.FileInfo
for {
    fs, err := dir.Readdir(batch)
    if err != nil { // io.EOF once the directory has been read completely
        log.Println(err)
        break
    }
    for _, fileInfo := range fs {
        var lastFile os.FileInfo
        if maxFiles <= len(files) {
            lastFile = files[len(files)-1] // newest of the files kept so far
        }
        if lastFile != nil && fileInfo.ModTime().After(lastFile.ModTime()) {
            continue // newer than everything we keep, discard it
        }
        files = append(files, fileInfo)
        sort.Slice(files, func(i, j int) bool {
            return files[i].ModTime().Before(files[j].ModTime())
        })
        if maxFiles < len(files) {
            files = files[:maxFiles] // drop the newest entry again
        }
    }
}
The basic idea is to only keep the oldest maxFiles files in memory and discard the newer ones immediately, or as soon as an older file pushes them out of the list.
Instead of a slice it might be helpful to look into using a btree (as it is sorted internally), a doubly linked list, or a heap, as sketched below. You'll have to do some benchmarking to figure out what is optimal.
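A minimal sketch of the same idea using the standard container/heap package (hypothetical example, not from the original answer; the directory ".", the batch size of 100, and maxFiles are placeholders):

package main

import (
    "container/heap"
    "fmt"
    "log"
    "os"
    "sort"
)

// fileHeap is a max-heap on ModTime: the newest kept file sits at the root,
// so it is the first candidate to be evicted when an older file shows up.
type fileHeap []os.FileInfo

func (h fileHeap) Len() int            { return len(h) }
func (h fileHeap) Less(i, j int) bool  { return h[i].ModTime().After(h[j].ModTime()) }
func (h fileHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *fileHeap) Push(x interface{}) { *h = append(*h, x.(os.FileInfo)) }
func (h *fileHeap) Pop() interface{} {
    old := *h
    x := old[len(old)-1]
    *h = old[:len(old)-1]
    return x
}

func main() {
    const maxFiles = 5
    dir, err := os.Open(".") // placeholder directory
    if err != nil {
        log.Fatal(err)
    }
    defer dir.Close()

    h := &fileHeap{}
    for {
        fs, readErr := dir.Readdir(100) // read directory entries in batches
        for _, fi := range fs {
            switch {
            case h.Len() < maxFiles:
                heap.Push(h, fi) // still filling up the set
            case fi.ModTime().Before((*h)[0].ModTime()):
                (*h)[0] = fi   // older than the newest kept file: replace it...
                heap.Fix(h, 0) // ...and restore heap order
            }
        }
        if readErr != nil { // io.EOF when the directory is exhausted
            break
        }
    }

    // The heap holds the maxFiles oldest entries, but not in sorted order; sort before printing.
    result := []os.FileInfo(*h)
    sort.Slice(result, func(i, j int) bool { return result[i].ModTime().Before(result[j].ModTime()) })
    for _, fi := range result {
        fmt.Println(fi.Name(), fi.ModTime())
    }
}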

Remove all characters after a delimiter in a string

I am building a web crawler application in golang.
After downloading the HTML of a page, I separate out the URLs.
I am presented with URLs that have "#s" in them, such as "en.wikipedia.org/wiki/Race_condition#Computing". I would like to get rid of all characters following the "#", since these lead to the same page anyway. Any advice on how to do so?
Use the url package:
u, _ := url.Parse("SOME_URL_HERE")
u.Fragment = ""
return u.String()
An improvement on the answer by Luke Joshua Park is to parse the URL relative to the URL of the source page. This creates an absolute URL from what might be a relative URL on the page (scheme not specified, host not specified, relative path). Another improvement is to check and handle errors.
func clean(pageURL, linkURL string) (string, error) {
    p, err := url.Parse(pageURL)
    if err != nil {
        return "", err
    }
    l, err := p.Parse(linkURL)
    if err != nil {
        return "", err
    }
    l.Fragment = "" // chop off the fragment
    return l.String(), nil
}
If you are not interested in getting an absolute URL, then chop off everything after the #. This works because the only valid use of # in a URL is the fragment separator.
func clean(linkURL string) string {
    i := strings.LastIndexByte(linkURL, '#')
    if i < 0 {
        return linkURL
    }
    return linkURL[:i]
}
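For example, applied to the URL from the question (a small usage sketch assuming the string-based clean above):

fmt.Println(clean("en.wikipedia.org/wiki/Race_condition#Computing"))
// Output: en.wikipedia.org/wiki/Race_condition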

Handling Dynamic Errors In Go (Specifically database/sql Package)

Using the database/sql package in Go for things like sql.Exec will return dynamically generated, unreferenced errors such as
"Error 1062: Duplicate entry '192' for key 'id'"
The problem is that it can also return errors such as
"Error 1146: Table 'tbl' doesn't exist"
from the same call to sql.Exec.
How can I tell the difference between these two errors without
string comparison, or
pattern matching for the error code?
Or are those the idiomatic, viable solutions for this problem?
The database/sql package does not solve this problem; it is driver specific. For example, for MySQL you can use:
if mysqlError, ok := err.(*mysql.MySQLError); ok {
    if mysqlError.Number == 1146 {
        // handling
    }
}
You can also use an error-constant package, like mysqlerr from VividCortex:
if mysqlError, ok := err.(*mysql.MySQLError); ok {
    if mysqlError.Number == mysqlerr.ER_NO_SUCH_TABLE {
        // handling
    }
}
It's not much better than pattern matching, but seems to be more idiomatic.
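Put together, the driver-specific check might look like this in context (a hypothetical sketch, not from the original answer; db and the query are assumptions, and errors.As requires Go 1.13+ but also handles wrapped errors):

import (
    "database/sql"
    "errors"

    "github.com/go-sql-driver/mysql"
)

func insertRow(db *sql.DB) error {
    _, err := db.Exec("INSERT INTO tbl (id) VALUES (?)", 192)
    var mysqlError *mysql.MySQLError
    if errors.As(err, &mysqlError) && mysqlError.Number == 1062 {
        // duplicate entry: handle it (e.g. ignore it or update the existing row)
        return nil
    }
    return err
}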
I think there's no idiomatic solution, but I wrote a simple function for extracting the error number, so you can easily compare against it.
In this solution I assume that the construction of the error message is always the same: "Error -some number here-: Error description".
If there's no number in the error, or something goes wrong, it returns 0.
func ErrorCode(e error) int {
    err := e.Error() // the description of the error
    if len(err) < 6 { // too short to contain "Error N"
        return 0
    }
    i := 6 // skip the "Error " prefix
    for ; len(err) > i && unicode.IsDigit(rune(err[i])); i++ {
    } // advance i until we reach the end of err or the end of the error code
    n, e := strconv.Atoi(err[6:i]) // convert the digits to an int
    if e != nil {
        return 0 // something went wrong
    }
    return n // return the error code
}
Go playground link: http://play.golang.org/p/xqhVycsuyI
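Usage might look like this (a hypothetical snippet; the db variable and the query are assumptions):

if _, err := db.Exec("INSERT INTO tbl (id) VALUES (?)", 192); err != nil {
    if ErrorCode(err) == 1062 {
        // duplicate entry for key 'id'
    }
}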

How to check file existence by its base name (without extension)?

The question is quite self-explanatory.
Could anybody please show me how to check for the existence of a file by its base name (without extension) in a short and efficient way? It would be great if the code returned several occurrences when the folder has several files with the same base name.
Example:
folder/
file.html
file.md
UPDATE:
It is not obvious from the official documentation how to use the filepath.Match() or filepath.Glob() functions. So here are some examples:
matches, _ := filepath.Glob("./folder/file*") // returns paths to real files: [folder/file.html folder/file.md]
matchesToPattern, _ := filepath.Match("./folder/file*", "./folder/file.html") // returns true, but it only compares the strings and doesn't check the real content
You need to use the path/filepath package.
The functions to check are Glob(), Match() and Walk(); pick whichever suits your taste best.
Here is the updated code:
package main

import (
    "fmt"
    "os"
    "path/filepath"
    "regexp"
)

func main() {
    dirname := "." + string(filepath.Separator)
    d, err := os.Open(dirname)
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
    defer d.Close()
    fi, err := d.Readdir(-1)
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
    r, _ := regexp.Compile("f([a-z]+)le") // the pattern to match
    for _, fi := range fi {
        if fi.Mode().IsRegular() { // is a regular file
            if r.Match([]byte(fi.Name())) { // name matches the pattern
                fmt.Println(fi.Name(), fi.Size(), "bytes")
            }
        }
    }
}
With this approach you can also filter by date, size or other file properties, or include subfolders.
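For the literal case in the question, a Glob-based sketch may be all that is needed (hypothetical snippet, assuming the folder/file.html and folder/file.md layout from the example):

matches, err := filepath.Glob("folder/file.*") // base name "file", any extension
if err != nil {
    log.Fatal(err)
}
exists := len(matches) > 0
fmt.Println(exists, matches) // true [folder/file.html folder/file.md]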

Why do file names get messy using archive/zip in golang, linux?

I'm using golang's standard package archive/zip to wrap several files into a zipfile.
Here is my test code:
package main

import (
    "archive/zip"
    "log"
    "os"
)

func main() {
    archive, _ := os.Create("/tmp/测试file.zip")
    w := zip.NewWriter(archive)
    // Add some files to the archive.
    var files = []struct {
        Name, Body string
    }{
        {"测试.txt", "test content: 测试"},
        {"test.txt", "test content: test"},
    }
    for _, file := range files {
        f, err := w.Create(file.Name)
        if err != nil {
            log.Fatal(err)
        }
        _, err = f.Write([]byte(file.Body))
        if err != nil {
            log.Fatal(err)
        }
    }
    err := w.Close()
    if err != nil {
        log.Fatal(err)
    }
}
Results:
I get a zip file named 测试file.zip under /tmp as expected.
After unzipping it, I get two files: test.txt and ц╡ЛшпХ.txt, and the second name is a mess.
The contents of both files are correct, as expected.
Why does this happen and how can I fix it?
This might be an issue with unzip not handling UTF-8 names properly. Explicitly using a Chinese locale worked for me:
$ LANG=zh_ZH unzip 测试file.zip
Archive: 测试file.zip
inflating: 测试.txt
inflating: test.txt
$ cat *.txt
test content: testtest content: 测试
If changing the locale is not an option, another approach is to encode the file name to GBK before writing it into the archive, so that unzip tools that assume a legacy Chinese code page display it correctly:

import (
    "golang.org/x/text/encoding/simplifiedchinese"
    "golang.org/x/text/transform"
)

filename, _, err := transform.String(simplifiedchinese.GBK.NewEncoder(), "测试.txt")
